EFFECTIVE EQUIDISTRIBUTION OF SHEARS AND APPLICATIONS

DUBI KELMER AND ALEX KONTOROVICH 


Abstract. A “shear” is a unipotent translate of a cuspidal geodesic ray in the 
quotient of the hyperbolic plane by a non-uniform discrete subgroup of PSL(2,R), 
possibly of infinite co-volume. We prove the regularized equidistribution of shears 
under large translates with effective (that is, power saving) rates. We also give 
applications to weighted second moments of GL(2) automorphic L-functions, and 
to counting lattice points on locally affine symmetric spaces. 
1. Introduction

In this paper, we prove the effective (meaning, with power savings rate) equidistribution
of “shears” (see below for definitions) of “cuspidal” geodesic rays on hyperbolic
surfaces. Our proofs are quite “soft,” in that we only use mixing and standard
properties of Eisenstein series, rather than explicit spectral decompositions, special
functions, or any estimates on time spent near a cusp. This allows us to extend our
methods to surfaces of infinite volume (in fact the proofs are surprisingly easier in
this case). As a direct consequence, we complete the general problem of obtaining
effective asymptotics for counting (in archimedean norm balls) discrete orbits on affine
quadrics; as described in §1.3.1, exactly two lacunary settings remained unsolved,
and both are settled in this paper. Another application is to weighted second moments
of GL(2) automorphic L-functions.
Figure 1. A shear of the cuspidal geodesic ray. Panel (a): lattice case, $\Gamma = \mathrm{PSL}_2(\mathbb{Z})$.


When the surface has infinite volume, we discover two new and completely unexpected
phenomena: (1) the orbit count asymptotic can be proved with a uniform
power savings error in congruence towers without inputting any information on the
spectral gap.$^{1}$ And even more surprisingly, (2) orbits in such towers are not uniformly
distributed among different cosets! The uniformity in cosets, were it true,
would have allowed the application of an Affine Sieve in this archimedean ordering
(see, e.g., [Kon14]); our observation shows that the Affine Sieve procedure cannot be
applied directly here, as the standard sieve axioms are not satisfied.$^{2}$

1.1. Statements of the Theorems. 

Our main equidistribution result is the following. Let $\Gamma$ be a discrete, Zariski-dense,$^{3}$
geometrically finite$^{4}$ subgroup of $G := \mathrm{PSL}_2(\mathbb{R})$, and assume that the hyperbolic
surface $\Gamma\backslash\mathbb{H}$, which may have finite or infinite volume, has at least one cusp.
In particular, this forces the critical exponent$^{5}$ $\delta$ of $\Gamma$ to exceed 1/2; this will be our
running assumption throughout.

The base point

$$x_0 \in T^1(\Gamma\backslash\mathbb{H}) \cong \Gamma\backslash G$$

in the unit tangent bundle determines the visual (under the forward geodesic flow)
limit point $\mathfrak{a}$ on the boundary $\Gamma\backslash\partial\mathbb{H}$. We call the point $x_0$, as well as its forward
geodesic ray, $x_0 \cdot A^+$, cuspidal if $\mathfrak{a}$ is a cusp of $\Gamma$; here

$$A^+ := \{a_y : y > 1\}.$$

$^{1}$ By “spectral gap” we always mean the distance between the first eigenvalue $\lambda_1$ and the base
eigenvalue $\lambda_0$ of the hyperbolic Laplacian; see §2.5.

$^{2}$ Of course one can instead order by wordlength in $\Gamma$, as is done in [BGS10], to restore
equidistribution and apply the Affine Sieve.

$^{3}$ Equivalently, non-elementary, that is, not virtually abelian.

$^{4}$ For surface groups, being geometrically finite is equivalent to being finitely generated.

$^{5}$ Roughly speaking, the critical exponent measures the asymptotic growth rate of $\Gamma$; see §2.1.
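(Note that we make no demands on the negative geodesic flow from $x_0$.) Given such
a ray, we define its shear (the ray is no longer geodesic), at time $T \in \mathbb{R}$, by:

$$x_0 \cdot A^+ \cdot \mathfrak{s}_T \subset \Gamma\backslash G,$$

where

$$\mathfrak{s}_T := a_{1/\sqrt{T^2+1}}\, n_T, \qquad a_y := \begin{pmatrix} \sqrt{y} & \\ & 1/\sqrt{y} \end{pmatrix}, \qquad n_x := \begin{pmatrix} 1 & x \\ & 1 \end{pmatrix}. \tag{1.1}$$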

For example, if $\Gamma = \mathrm{PSL}_2(\mathbb{Z})$, then the base point $x_0 = (i, \uparrow)$ has visual limit point
$\mathfrak{a} = \infty$, and hence is cuspidal. The shear at time $T$ of the forward ray from $x_0$,
projected to $\Gamma\backslash\mathbb{H}$, is then simply the Euclidean ray $\{re^{i\theta}\}_{r>1}$, where $\cot\theta = T$. See
Figure 1a for an illustration of this ray and its projection mod $\mathrm{PSL}_2(\mathbb{Z})$. Similarly,
Figure 1b gives the same picture but for a thin group $\Gamma$.
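
As a quick check of this example, the following worked computation is a sketch under an assumption that is not fully legible in this copy: that the shear element is normalized as $\mathfrak{s}_T = a_{1/\sqrt{T^2+1}}\, n_T$. It also assumes that $x_0 = (i,\uparrow)$ is identified with the identity coset, that $a_y$ acts on $\mathbb{H}$ by $z \mapsto yz$, and that $n_x$ acts by $z \mapsto z + x$.

```latex
% Assumption: s_T = a_{1/\sqrt{T^2+1}} n_T, and x_0 is the identity coset.
\begin{align*}
  a_y\,\mathfrak{s}_T \cdot i
    &= a_y\, a_{1/\sqrt{T^2+1}} \cdot (i + T)
     = \frac{y\,(T + i)}{\sqrt{T^2+1}}, \\
  \big| a_y\,\mathfrak{s}_T \cdot i \big| &= y, \qquad
  \cot \arg\big( a_y\,\mathfrak{s}_T \cdot i \big)
     = \frac{\Re(T + i)}{\Im(T + i)} = T .
\end{align*}
```

So as $y$ ranges over $(1,\infty)$ the base points trace the Euclidean ray of radius $r = y > 1$ and angle $\theta$ with $\cot\theta = T$. Without the normalizing factor $a_{1/\sqrt{T^2+1}}$ the same ray would begin at radius $\sqrt{T^2+1}$ instead of 1.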

We are interested in the behavior of such shears as $T \to \infty$ (and similarly for
$T \to -\infty$). To this end, define the measure $\mu_T$ on a smooth, compactly supported
observable $\Psi \in C_c^\infty(\Gamma\backslash G)$ by

$$\mu_T(\Psi) := \int_{a \in A^+} \Psi(x_0 \cdot a \cdot \mathfrak{s}_T)\, da = \int_1^\infty \Psi(x_0 \cdot a_y \cdot \mathfrak{s}_T)\, \frac{dy}{y}. \tag{1.2}$$

A slight simplification of our main result (see Theorem 3.2) is the following.

Theorem 1.3.

Lattice Case: Assume that the quotient $\Gamma\backslash G$ has finite volume. Then there exists
an $\eta > 0$, depending on the spectral gap for $\Gamma$, so that

$$\mu_T(\Psi) = \log T \cdot \mu_{\Gamma\backslash G}(\Psi) + \tilde\mu_{\mathrm{Eis}}(\Psi) + O_\Psi(T^{-\eta}), \tag{1.4}$$

as $T \to \infty$. Here $\mu_{\Gamma\backslash G}$ is the probability Haar measure and $\tilde\mu_{\mathrm{Eis}}$ is a certain
“regularized Eisenstein” distribution (see Remark 1.8 below).

Thin Case: Assume that $\Gamma$ is thin, that is, the quotient $\Gamma\backslash G$ has infinite volume.
Then there exists an $\eta > 0$, depending only on the critical exponent $\delta$ of $\Gamma$, and not
on its spectral gap (!), so that

$$\mu_T(\Psi) = \mu_{\mathrm{Eis}}(\Psi) + O_\Psi(T^{-\eta}), \tag{1.5}$$

as $T \to \infty$. Here $\mu_{\mathrm{Eis}}$ is an (un-regularized) Eisenstein distribution.

Some comments are in order. 

Remark 1.6. For simplicity, we have stated Theorem 1.3 for compactly supported test
functions $\Psi$, but our method applies just as well to a larger class of square-integrable
functions with at least polynomial decay in the cusp $\mathfrak{a}$ (to ensure convergence of
$\mu_T(\Psi)$); see §3.

Remark 1.7. Throughout we make no attempt to optimize the various error exponents
$\eta$, as can surely be done with a modicum of effort; our point is to illustrate a soft
method which is powerful enough to obtain new results with power savings errors.
Remark 1.8. Let us make the Eisenstein distributions arising in (1.4)–(1.5) less
mysterious. These distributions are actually measures when $\Psi$ is right $K$-invariant; we
restrict attention to this case to simplify the discussion below. First assume that $\Gamma$
is a lattice, that $x_0 = (i, \uparrow)$ with $\mathfrak{a} = \infty$ a cusp of $\Gamma$ of width 1, and let
$\Gamma_\infty = \begin{pmatrix} 1 & \mathbb{Z} \\ & 1 \end{pmatrix}$ be the isotropy group of $\mathfrak{a}$ in $\Gamma$. Then one has the standard Eisenstein series

$$E(z,s) := \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \Im(\gamma z)^s, \qquad (\Re(s) > 1),$$

which is well known [Sel56] to have meromorphic continuation and a simple pole at
$s = 1$ with residue $\mathrm{vol}(\Gamma\backslash\mathbb{H})^{-1}$. Thus the function

$$\tilde E(z,s) := E(z,s) - \frac{1}{\mathrm{vol}(\Gamma\backslash\mathbb{H})\,(s-1)} \tag{1.9}$$

is regular at $s = 1$; for example, when $\Gamma = \mathrm{PSL}_2(\mathbb{Z})$, we have (see, e.g., [IK04, (22.42),
(22.63)–(22.69)])

$$\tilde E(z,1) = \frac{3}{\pi}\left( 2\gamma - 2\frac{\zeta'}{\zeta}(2) - \log\big(4y\,|\eta(z)|^4\big) \right), \tag{1.10}$$

where $\gamma = 0.577\cdots$ is Euler’s constant, $\zeta(s)$ is the Riemann zeta function, and $\eta(z)$
is the Dedekind eta function. Then the measure $\tilde\mu_{\mathrm{Eis}}$ is simply given by:


Note that $\log|\eta(z)|$ grows like $y$ in the cusp, so $\tilde\mu_{\mathrm{Eis}}$ is also a non-finite measure. See
(3.40) for the definition when $\Psi$ is not $K$-finite.

In the thin case of (1.5), the Eisenstein series is itself regular at $s = 1$, that is, we
can simply take $\tilde E(z,s) = E(z,s)$; the spectral contribution is then all of lower order,
so the power savings obtained in (1.5) is independent of any knowledge of a spectral
gap for $\Gamma$.
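
The explicit value in (1.10) can be recovered from the classical Kronecker limit formula. The following is a sketch for $\mathrm{PSL}_2(\mathbb{Z})$, assuming the standard completed series $E^*(z,s) := \pi^{-s}\Gamma(s)\zeta(2s)E(z,s)$ and its well-known Laurent expansion at $s=1$. The notation $E^*$ is introduced here for illustration and is not the paper's; in this block $\Gamma(s)$ denotes the Gamma function, not the group.

```latex
% E^* is notation assumed for this sketch; \Gamma(s) is the Gamma function.
\begin{align*}
  E^*(z,s) &= \frac{1}{2(s-1)} + \frac{\gamma - \log(4\pi)}{2}
              - \log\!\big(\sqrt{y}\,|\eta(z)|^2\big) + O(s-1), \\
  \frac{\pi^{s}}{\Gamma(s)\,\zeta(2s)}
           &= \frac{6}{\pi}\Big( 1 + \big(\log\pi + \gamma
              - 2\tfrac{\zeta'}{\zeta}(2)\big)(s-1) + O\big((s-1)^2\big) \Big), \\
  E(z,s)   &= \frac{3}{\pi}\cdot\frac{1}{s-1}
              + \frac{3}{\pi}\Big( 2\gamma - 2\tfrac{\zeta'}{\zeta}(2)
              - \log\!\big(4y\,|\eta(z)|^4\big) \Big) + O(s-1).
\end{align*}
```

Since $\mathrm{vol}(\mathrm{PSL}_2(\mathbb{Z})\backslash\mathbb{H}) = \pi/3$, the polar term is exactly the one subtracted in (1.9), and the constant term that remains is the right side of (1.10).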


Remark 1.12. The first factor $\log T$ on the right side of (1.4) is a manifestation of
the logarithmic divergence of the measure $\mu_T$. In the lattice case, a statement of the
form

$$\mu_T(\Psi) = \mathrm{polynomial}(\log T) \cdot \mu_{\Gamma\backslash G}(\Psi) \cdot \big(1 + o(1)\big) \tag{1.13}$$

was suggested in work of Duke-Rudnick-Sarnak [DRS93, see below (1.4)]. Recently,
Oh-Shah [OS14] used a purely dynamical method to prove (a variant of) (1.13) with
a log-savings rate, that is, with $o(1)$ replaced by $O(1/\log T)$. With such a rate
it is of course impossible to see the second-order main term (that is, the regularized
Eisenstein distribution), and this identification will be key to some of our applications
below. Moreover, it is hard to imagine how a quantity like (1.10) can be captured
using only dynamics; our approach is quite different.


Before discussing the proof of Theorem 1.3, we first give some applications. 
1.2. Application 1: Weighted Second Moments of GL(2) L-functions.

Integrals like $\mu_T(\Psi)$ arise naturally in Sarnak’s approach (“changing the test
vector”) [Sar85] for second moments of L-functions (see also, e.g., [Goo86, Ven10, MV10,
DG10, DGG12]). We illustrate the method in the simplest case of a weight-$k$
holomorphic Hecke cusp form $f$ on $\mathrm{PSL}_2(\mathbb{Z})$, though the method works just as well for
any GL(2) automorphic representation.

Write the Fourier expansion of $f$ as

$$f(z) = \sum_{n \ge 1} a_f(n)\, e(nz),$$

where $a_f(1) = 1$ and the coefficients $a_f(n)$ are multiplicative, satisfying Hecke
relations, and the Ramanujan bound $|a_f(p)| \le 2p^{(k-1)/2}$ [Del74]. The standard L-function
of $f$,

$$L(f,s) := \sum_{n \ge 1} \frac{a_f(n)}{n^{s+(k-1)/2}},$$

converges for $\Re(s) > 1$, has analytic continuation to $\mathbb{C}$, and a functional equation
sending $s \mapsto 1 - s$. The Rankin-Selberg L-function factors (see [Iwa97, (13.1)]) as

$$L(f \otimes f, s) := \sum_{n \ge 1} \frac{|a_f(n)|^2}{n^{s+(k-1)}} = \frac{\zeta(s)}{\zeta(2s)}\, L(\mathrm{sym}^2 f, s),$$

where $L(\mathrm{sym}^2 f, s)$ is the symmetric square L-function, known [GJ78] to be the
automorphic L-function of a cohomological form on GL(3).

A shear of the standard Hecke integral (already arising implicitly in classical work
of Titchmarsh [Tit51, Chap. VII]) is the following calculation:

$$\int_0^\infty f(Ty + iy)\, y^{s+(k-1)/2}\, \frac{dy}{y} = L(f,s)\, W_k(s,T), \tag{1.14}$$

where

$$W_k(s,T) := (2\pi)^{-(s+(k-1)/2)}\, \Gamma\big(s + \tfrac{k-1}{2}\big)\, (1 - iT)^{-(s+(k-1)/2)}. \tag{1.15}$$

Applying Parseval to (1.14) gives (for $s = 1/2 + it$)

$$\int_0^\infty |f(Ty + iy)|^2\, y^k\, \frac{dy}{y} = \frac{1}{2\pi} \int_{\mathbb{R}} \big|L(f, \tfrac12 + it)\big|^2\, \big|W_k(\tfrac12 + it, T)\big|^2\, dt. \tag{1.16}$$

A calculation with Stirling’s formula (or see Figure 2) shows that $|W_k(\tfrac12 + it, T)|^2$
has rapid decay as soon as $|t| > T^{1+\varepsilon}$, and is of size roughly $1/T$ in the bulk. Thus
the quantity on the right side of (1.16) behaves like a smoothed second moment of
$L(f,s)$ on the critical line. Applying Theorem 1.3 with $\Psi = |f|^2 y^k$ (and a little more
work, see §4) gives the following.
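
The identity (1.14)–(1.15) comes from integrating the Fourier expansion term by term, and the sketch below spells this out. It writes $w := s + (k-1)/2$ (a shorthand introduced here, not in the paper), takes $\Re(s)$ large enough to justify the interchange, and uses only the Gamma integral $\int_0^\infty e^{-cy} y^{w}\, dy/y = \Gamma(w)\, c^{-w}$ for $\Re(c) > 0$. The last line uses the principal branch of the power and assumes nothing beyond (1.15).

```latex
% Shorthand w := s + (k-1)/2 is ours; principal branch for (1 - iT)^{-w}.
\begin{align*}
  f(Ty + iy) &= \sum_{n \ge 1} a_f(n)\, e^{2\pi i n (T + i) y}
              = \sum_{n \ge 1} a_f(n)\, e^{-2\pi n (1 - iT) y}, \\
  \int_0^\infty f(Ty + iy)\, y^{w}\, \frac{dy}{y}
             &= \sum_{n \ge 1} a_f(n)\, \Gamma(w)\, \big(2\pi n (1 - iT)\big)^{-w}
              = L(f,s)\, (2\pi)^{-w}\, \Gamma(w)\, (1 - iT)^{-w}, \\
  \big| W_k(\tfrac12 + it, T) \big|^2
             &= (2\pi)^{-k}\, \big| \Gamma(\tfrac{k}{2} + it) \big|^2\,
                (1 + T^2)^{-k/2}\, e^{-2t \arctan T}.
\end{align*}
```

Combined with Stirling's estimate $|\Gamma(\tfrac{k}{2} + it)|^2 \sim 2\pi |t|^{k-1} e^{-\pi|t|}$, the weight for $t < 0$ behaves like a constant times $|t|^{k-1} T^{-k} e^{-|t|(\pi - 2\arctan T)}$, and $\pi - 2\arctan T = 2\arctan(1/T) \approx 2/T$. This is consistent with a bulk of size about $1/T$ for $|t|$ of order $T$, followed by decay beyond that range.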
Figure 2. A sample graph of the smoothed archimedean weight $|W_k(\tfrac12 + it, T)|^2$.


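Theorem 1.17. With notation as above, there is an $\eta > 0$ so that

$$\frac{1}{2\pi} \int_{\mathbb{R}} \big|L(f, \tfrac12 + it)\big|^2\, \big|W_k(\tfrac12 + it, T)\big|^2\, dt
= \frac{2\,\|f\|^2}{\mathrm{vol}(\Gamma\backslash\mathbb{H})} \left( \log T + \frac{\Lambda'}{\Lambda}(\mathrm{sym}^2 f, 1) + \gamma - 2\frac{\zeta'}{\zeta}(2) \right) + O_f(T^{-\eta}), \tag{1.18}$$

as $T \to \infty$. Here $\gamma$ is again Euler’s constant, $\|f\|$ is the Petersson norm, and $\Lambda'/\Lambda$
is the logarithmic derivative of the completed symmetric-square L-function,

$$\Lambda(\mathrm{sym}^2 f, s) = (4\pi)^{-(s+k-1)}\, \Gamma(s + k - 1)\, L(\mathrm{sym}^2 f, s). \tag{1.19}$$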

Remark 1.20. One can chase the various exponents in our proof to see that (1.18)
holds with $\eta = 1/14 - \varepsilon$. Again, we are striving for simplicity of the method and
not optimal exponents, see Remark 1.7. In fact, a straightforward refinement of the
proof of Proposition 2.16 (using explicit spectral expansions in place of soft ergodic
arguments) gives $\eta = 16/39 - \varepsilon$ on quoting the best-known bounds [KS03] towards
the Ramanujan conjectures, and $\eta = 1/2 - \varepsilon$ conditionally. So in a sense, the proof
of Theorem 1.17 is “sharp,” as there is no “loss” in the rate from a best-possible one.


Remark 1.21. On comparing the lower order terms on the right hand side of (1.18)
with the secondary term in (1.4), and using (1.10) and (1.11), one derives a
Kronecker-type limit formula, in the form:

$$\frac{\big\langle \log\big(4y\,|\eta(\cdot)|^4\big),\ |f|^2 y^k \big\rangle}{\|f\|^2} = \gamma - \frac{\Lambda'}{\Lambda}(\mathrm{sym}^2 f, 1). \tag{1.22}$$

This identity is surely classical, though we were not able to locate a precise reference.
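
One way to see why an identity of the shape (1.22) holds is the Rankin-Selberg unfolding computation sketched below. It assumes $\Gamma = \mathrm{PSL}_2(\mathbb{Z})$, writes $d\mu = dx\,dy/y^2$, and takes $\|f\|^2 = \int_{\Gamma\backslash\mathbb{H}} |f|^2 y^k\, d\mu$ (a normalization assumed here for illustration). It uses the factorization of $L(f \otimes f, s)$, the completed L-function (1.19), and the residue $3/\pi$ of $E(z,s)$ at $s = 1$.

```latex
% Assumptions: \Gamma = PSL_2(Z), d\mu = dx dy / y^2, ||f||^2 = \int |f|^2 y^k d\mu.
\begin{align*}
  \int_{\Gamma\backslash\mathbb{H}} |f(z)|^2 y^k\, E(z,s)\, d\mu(z)
    &= \int_0^\infty \sum_{n \ge 1} |a_f(n)|^2 e^{-4\pi n y}\, y^{s+k-1}\, \frac{dy}{y}
     = \frac{\zeta(s)}{\zeta(2s)}\, \Lambda(\mathrm{sym}^2 f, s), \\
  \frac{\zeta(s)}{\zeta(2s)}\, \Lambda(\mathrm{sym}^2 f, s)
    &= \frac{\Lambda(\mathrm{sym}^2 f, 1)}{\zeta(2)}
       \left( \frac{1}{s-1} + \gamma - 2\frac{\zeta'}{\zeta}(2)
              + \frac{\Lambda'}{\Lambda}(\mathrm{sym}^2 f, 1) + O(s-1) \right), \\
  \frac{1}{\|f\|^2} \int_{\Gamma\backslash\mathbb{H}} |f(z)|^2 y^k\, \tilde E(z,1)\, d\mu(z)
    &= \frac{3}{\pi} \left( \gamma - 2\frac{\zeta'}{\zeta}(2)
       + \frac{\Lambda'}{\Lambda}(\mathrm{sym}^2 f, 1) \right).
\end{align*}
```

The third line follows from the second by comparing residues, which gives $\Lambda(\mathrm{sym}^2 f, 1)/\zeta(2) = (3/\pi)\,\|f\|^2$. Inserting the explicit form (1.10) of $\tilde E(z,1)$ on the left and cancelling the common terms $\gamma - 2(\zeta'/\zeta)(2)$ leaves $\gamma - (\Lambda'/\Lambda)(\mathrm{sym}^2 f, 1)$ for the normalized pairing against $\log(4y\,|\eta|^4)$.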
1.3. Application 2: Archimedean Counting for Orbits on Affine Quadrics.

Another standard context where integrals like $\mu_T(\Psi)$ arise naturally is in the
execution of certain Margulis/Duke-Rudnick-Sarnak/Eskin-McMullen type arguments
[Mar04, DRS93, EM93] for counting discrete orbits on quadrics in archimedean balls.
The setting is as follows.

Let $Q$ be a real ternary indefinite quadratic form (e.g., $Q(\mathbf{x}) = x^2 + y^2 - z^2$), fix
$d \in \mathbb{R}$, and denote by $V = V_{Q,d}$ the affine quadric

$$V : Q = d. \tag{1.23}$$

The real points $V(\mathbb{R})$ enjoy an action by $G = \mathrm{SO}_Q^{\circ}(\mathbb{R})$, the connected component
of the identity in the real special orthogonal group preserving $Q$. Let $\Gamma < G$ be a
discrete, Zariski dense, geometrically finite subgroup of $G$, and assume, as throughout,
that the critical exponent $\delta$ of $\Gamma$ exceeds 1/2. Fix a base point $x_0 \in V(\mathbb{R})$, subject to
the orbit

$$\mathcal{O} := x_0 \cdot \Gamma \subset \mathbb{R}^3$$

being discrete.

For a fixed archimedean norm $\|\cdot\|$ on $\mathbb{R}^3$, let

$$B_T := \{\mathbf{x} \in \mathbb{R}^3 : \|\mathbf{x}\| < T\}$$

be the norm ball of radius $T$. A very classical and well-studied problem is to give an
effective (that is, with power savings error) estimate for

$$\mathcal{N}_{\mathcal{O}}(T) := |\mathcal{O} \cap B_T|.$$

Despite the vast attention this problem has received over the years, there remained
exactly two lacunary cases which hitherto resisted solution; see §1.3.1 and Table 1
below for a detailed taxonomy of the situation. Equipped with Theorem 1.3, we can
now resolve the outstanding cases.


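Theorem 1.24. There exist constants $C_1$, $C_2$, and $\eta > 0$ so that the following holds:

• If $\Gamma$ is a lattice in $G$, then

$$\mathcal{N}_{\mathcal{O}}(T) = (C_1 T \log T + C_2 T)\big(1 + O(T^{-\eta})\big). \tag{1.25}$$

• If $\Gamma$ is thin, then

$$\mathcal{N}_{\mathcal{O}}(T) = (C_1 T + C_2 T^{\delta})\big(1 + O(T^{-\eta})\big). \tag{1.26}$$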

Some comments are in order: 


Remark 1.27. 


(1) All of the previously resolved cases of this problem (in the above generality)
were such that the first term did not appear, that is, $C_1 = 0$ (whence $C_2 > 0$).
Our new contributions are to the cases with $C_1 > 0$, which arise exactly when
the stabilizer of $x_0$ in $G$ is diagonalizable but “cuspidal” in $\Gamma\backslash G$; see §1.3.1
below.

(2) If it happens that $\mathcal{O}$ is not just some arbitrary real discrete orbit but is
actually the full integer quadric $V(\mathbb{Z})$ (assuming of course that the quadratic
form $Q$ is rational and that $V(\mathbb{Z})$ is non-empty), then one has many more
tools available to approach the counting problem for $\mathcal{N}_{\mathcal{O}}(T)$. Specifically, one
can use, e.g., classical methods of exponential sums (see [Hoo63]), or half-integral
weight automorphic forms, Poincaré series and shifted convolutions
[Sar84, Blo08, TT13], or multiple Dirichlet series [HKKL13]. These give, when
$\Gamma$ is a lattice and $C_1 > 0$, an estimate for $\mathcal{N}_{\mathcal{O}}(T)$ of the same strength as (1.25),
that is, with a secondary main term and a power savings error. Such tools do
not seem to apply to the general orbit counting problem.

(3) In the thin case with $C_1 > 0$, the second term $C_2 T^{\delta}$ is swamped by the error,
and should not be confused with a lower order “main” term. We would like
to acknowledge here that Nimish Shah suggested to us that the main term in
this setting is of order $T$ rather than $T^{\delta}$; see also [OS13, Remark 1.7].

(4) For the “new” cases with $C_1 > 0$, the exponent $\eta$ depends on the same
quantities as in Theorem 1.3; that is, $\eta$ depends on the spectral gap in the lattice
case, and only on the critical exponent in the thin case.

(5) The constants $C_1$, $C_2$ can be readily determined explicitly in terms of
volumes, special values of (possibly regularized) Eisenstein series, and
Patterson-Sullivan measures.

(6) A consequence of Oh-Shah’s result discussed in Remark 1.12 gives, in the
lattice case, the estimate (1.25), but with $O(T^{-\eta})$ replaced by the weaker
error rate $O(1/(\log T)^{\eta})$ for some small $\eta > 0$. This of course only identifies
the first main term $C_1 T \log T$, as the secondary term $C_2 T$ is swallowed by the
error.

1.3.1. Taxonomy. 

To explain the lacunary cases settled by Theorem 1.24, we begin by passing from
$\mathrm{SO}_Q^{\circ}(\mathbb{R})$ to its spin cover $\mathrm{PSL}_2(\mathbb{R}) \cong T^1(\mathbb{H})$. Abusing notation, we continue to write
$G$ and $\Gamma$ for their pre-images in $\mathrm{PSL}_2(\mathbb{R})$.

Let $H$ be the stabilizer of $x_0$ in $G$,

$$H := \{h \in G : x_0 \cdot h = x_0\},$$

and let

$$\Gamma_H := \Gamma \cap H.$$

With $x_0$ fixed, the norm $\|\cdot\|$ on $\mathbb{R}^3$ induces a left-$H$-invariant norm $|||\cdot|||$ on $G$ given
by

$$|||g||| := \|x_0 \cdot g\|. \tag{1.28}$$

We further abuse notation, writing $B_T$ for the left-$H$-invariant norm-$T$ ball in $G$, that
is, $B_T \subset H\backslash G$. Then it is easy to see that

$$\mathcal{N}_{\mathcal{O}}(T) = |\Gamma_H\backslash\Gamma \cap B_T|.$$
Figure 3. The region $B_T$ as a subset of the hyperbolic disk $\mathbb{D}$. Panels: (a) Case $H \cong K$; (b) Case $H \cong N$; (c) Case $H \cong A$.

To investigate this counting problem more precisely, we illustrate the geometry of
$B_T$, which is determined by whether the stabilizer $H$ is conjugate to the groups $K$, $N$,
or $A$. That is, $H$ is either a maximal compact, a unipotent subgroup, or diagonalizable
(over $\mathbb{R}$), and this corresponds to whether the real quadric $V(\mathbb{R})$ is a two-sheeted
hyperboloid, a cone, or a one-sheeted hyperboloid, respectively. To visualize $B_T$ as a
left-$H$-invariant subset of $G$, we project to the base space $\mathbb{H}$ (or alternatively, assume
that the norm $|||\cdot|||$ is right-$K$-invariant), so that $B_T$ can be viewed as an $H$-invariant
subset of the hyperbolic disk $\mathbb{D} \cong G/K$. Then $B_T$ is illustrated in Figure 3 in the
three cases. Note that $B_T$ has zero, one, or two limit points (denoted $\xi$ or $\xi_\pm$) on the
boundary $\partial\mathbb{D}$, corresponding to whether $H \cong K$, $H \cong N$, or $H \cong A$, respectively.

The asymptotic counting analysis depends in a fundamental way not only on
whether $\Gamma$ is a lattice in $G$, but also on whether

$$\Gamma_H \text{ is a lattice in } H. \tag{1.29}$$

Lemma 1.30. If (1.29) does not hold, then the discreteness of $\mathcal{O}$ is equivalent to the
endpoints $\xi$ or $\xi_\pm$ of $H$ not being radial limit points$^{6}$ for $\Gamma$.

This follows from a simple topological argument; we omit the proof. We decompose
the analysis according to whether $\Gamma$ is a lattice or thin in $G$.

Case I: $\Gamma$ is a lattice.

Assuming that $\Gamma$ is a lattice in $G$, and also demanding that (1.29) holds,
Duke-Rudnick-Sarnak [DRS93] and Eskin-McMullen [EM93] (see also [Mar04]) showed (in
much greater generality than considered here) that there is some $\eta > 0$ with

$$\mathcal{N}_{\mathcal{O}}(T) = \mathrm{vol}_{H\backslash G}(B_T)\big(1 + O(T^{-\eta})\big), \tag{1.31}$$

as $T \to \infty$. Here the volumes are taken to be compatible with choices of Haar measure
on $G$, $H$, and $H\backslash G$. Note that $\mathrm{vol}_{H\backslash G}(B_T)$ is of order $T$, so there is no logarithmic
divergence in (1.31), that is, $C_1 = 0$ and $C_2 > 0$ in (1.25); see also Remark 1.27(1).

$^{6}$ Recall that the limit set, $\Lambda$, of $\Gamma$ decomposes disjointly into cusps (i.e., parabolic fixed points)
and radial limit points (also called “points of approximation”); the complement $\partial\mathbb{H} \setminus \Lambda$ is called the
free boundary (which is empty if $\Gamma$ is a lattice). See §2.1.

With Figure 3 and Lemma 1.30 in mind, we analyze separately the possible
conjugacy classes of $H$.

• First if $H \cong K$ is compact, then (1.29) clearly holds automatically. In this
case, the counting result (1.31) corresponds to counting in norm balls of $G$,
which dates back to Delsarte [Del42], Huber [Hub56], and Selberg [Sel56].

• If $H \cong N$ is unipotent, then by Lemma 1.30, the boundary point $\xi$ of $H$
must be a cusp. That is, $\Gamma_H\backslash H$ is a closed horocycle, so $\Gamma_H$ is a lattice in
$H$, and (1.29) is again automatically satisfied. In this case, the count takes
place in a strip $\Gamma_H\backslash G$, and the equidistribution of low-lying closed horocycles
[Mar04, Zag81, Sar81] can be used to establish (1.31).

• Lastly, if $H \cong A$ is diagonalizable (over $\mathbb{R}$), then Lemma 1.30 forces one of
two settings. Either

(i) $\Gamma_H$ is a lattice in $H$, whence $\Gamma_H\backslash H$ corresponds to a closed geodesic
on $\Gamma\backslash G$. Then (1.29) holds, so (1.31) follows from [DRS93]. Or

(ii) $\Gamma_H$ is finite, but both limit points $\xi_+$ and $\xi_-$ of $H$ (see Figure 3c) are
cusps of $\Gamma$. Here we are in the diagonalizable but “cuspidal” setting referred
to in Remark 1.27(1). This case is the only one (when $\Gamma$ is a lattice in $G$) not
satisfying (1.29) despite the discreteness of the orbit $\mathcal{O}$; it is precisely the new
case settled by Theorem 1.24.

Case II: $\Gamma$ is thin.

In this setting, we again decompose the problem of estimating $\mathcal{N}_{\mathcal{O}}(T)$ into separate
cases, depending on the conjugacy class of $H$, and on whether condition (1.29) holds
(there are now more situations in which $\mathcal{O}$ is discrete but (1.29) can fail).

• When $H \cong K$ is compact, the condition (1.29) is again automatically satisfied,
and in this case Lax-Phillips [LP82] showed that

$$\mathcal{N}_{\mathcal{O}}(T) = C_2\, T^{\delta}\big(1 + O(T^{-\eta})\big), \tag{1.32}$$

where $\eta > 0$ depends on the spectral gap for $\Gamma$. This corresponds to the case
$C_1 = 0$ in (1.26); again, see Remark 1.27(1).

• If $H \cong N$ is unipotent, the discreteness of $\mathcal{O}$ forces one of two cases. Either

(i) $\Gamma_H$ is a lattice in $H$, so (1.29) holds, and $\Gamma_H\backslash H$ corresponds to a
closed horocycle. In this case, the asymptotic formula (1.32) was shown in
the second-named author’s thesis [Kon09]. Or

(ii) $\Gamma_H$ is trivial, whence Lemma 1.30 forces the limit point $\xi$ of $H$ to be
in the free boundary, that is, it is not in the limit set of $\Gamma$. The asymptotic
here is also of the form (1.32); see [KO12].
• Finally, when $H \cong A$ is diagonalizable, there are three separate cases to
consider. The discreteness of $\mathcal{O}$ now implies either

(i) $\Gamma_H$ is a lattice in $H$, again corresponding to a closed geodesic on $\Gamma\backslash G$.
Then the same asymptotic (1.32) follows from now-standard methods using
the equidistribution result of Bourgain-Kontorovich-Sarnak [BKS10]. Or

(ii) $\Gamma_H$ is thin in $H$, in which case each of the two endpoints $\xi_\pm$ of $H$ is
either a cusp of $\Gamma$ or in the free boundary. If

(a) both endpoints are in the free boundary, then the methods of
[BKS10] can again be used to show the same asymptotic (1.32). (The key is
that only a finite portion of the sheared geodesic ray interacts with the limit
set; see [KO12] where a similar phenomenon was studied in the case of a
unipotent stabilizer.) Otherwise,

(b) at least one of $\xi_\pm$ is a cusp of $\Gamma$. This is the other new lacunary case
of Theorem 1.24, and is the only thin case for which (1.26) has $C_1 > 0$. Note
that if one boundary point is a cusp and the other is in the free boundary, the
former has contribution of order $T$, while the latter’s contribution, of order
$T^{\delta}$, is dominated by the former’s error term.

This concludes our taxonomy. To summarize, the following table serves to illustrate
that Theorem 1.24 above completes the effective solution to the general orbital
counting problem in our context:

($\Gamma$, $\Gamma_H$)     | $H \cong N$                                   | $H \cong A$                                                                              | $H \cong K$
(lattice, lattice) | [Zag81, Sar81]                                | [DRS93, EM93]                                                                            | [Del42, Hub56, Sel56]
(lattice, thin)    | impossible by discreteness of $\mathcal{O}$   | “lacunary” case settled in (1.25)                                                        | impossible by compactness of $K$
(thin, lattice)    | [Kon09]                                       | [BKS10]                                                                                  | [LP82]
(thin, thin)       | [KO12]                                        | [BKS10], if both $\xi_\pm \notin \Lambda$; “lacunary” case settled in (1.26), otherwise | impossible by compactness of $K$

Table 1. The new cases of Theorem 1.24, highlighted, are those with $C_1 > 0$.


Remark 1.33. When the critical exponent $\delta \le 1/2$, work of Naud [Nau05], extending
Dolgopyat’s methods [Dol98], allows one to conclude, in the cases not excluded by
Lemma 1.30, an effective asymptotic of the form (1.32). So the lacunary cases do not
occur here, as there are no cusps (and hence no cuspidal geodesic rays) when $\delta \le 1/2$.

Remark 1.34. As pointed out in [OS14, p. 917] (at least for $\Gamma$ a lattice), the only
lacunary cases in the more general setting of $Q$ having signature $(n, m)$ are precisely
those of signature $(2, 1)$, that is, those considered here; so we have lost no generality in
restricting to $\mathrm{PSL}_2(\mathbb{R})$. This is because the stabilizer $H$ is either unipotent, compact,
or fixes a form of signature $(n-1, m)$ or $(n, m-1)$. The only non-compact such not
generated by unipotents has signature $(1,1)$, whence $Q$ has signature $(2, 1)$.

1.4. Surprise: Non-equidistribution in Congruence Cosets! 

The most unexpected consequence of Theorem 1.24 comes from studying the thin
case in cosets of congruence towers, as we now describe. Assume that $Q$ is not just
a real quadratic form but an integral one, and that $\Gamma$ is a subgroup of the integral
special orthogonal group

$$\Gamma < \mathrm{SO}_Q(\mathbb{Z}).$$

Given an integer $q \ge 1$, we can then form the level-$q$ “congruence” subgroups $\Gamma(q) < \Gamma$,
defined as

$$\Gamma(q) := \ker\big(\Gamma \to \mathrm{SO}_Q(\mathbb{Z}/q\mathbb{Z})\big).$$

For many applications, one wishes to study the same counting problem as above, with
the orbit $\mathcal{O} = x_0 \cdot \Gamma$ replaced by the congruence orbit $x_0 \cdot \Gamma(q)$, or better yet, by some
“congruence coset” orbit,

$$\mathcal{O}_{q,\varpi} := x_0 \cdot \varpi\, \Gamma(q),$$

for a given $\varpi \in \Gamma/\Gamma(q)$. Let

$$\mathcal{N}_{q,\varpi}(T) := |\mathcal{O}_{q,\varpi} \cap B_T|$$

be the corresponding counting function, which we wish to estimate uniformly with $q$
and $T$ (and $\varpi$) varying in some allowable range.

Theorem 1.24 applies just as well to estimate $\mathcal{N}_{q,\varpi}(T)$, and in all previously studied
examples, the asymptotic analysis showed that

$$\mathcal{N}_{q,\varpi}(T) \sim \frac{1}{[\Gamma : \Gamma(q)]}\, \mathcal{N}_{\mathcal{O}}(T), \qquad (T \to \infty), \tag{1.35}$$

that is, the asymptotic is independent of $\varpi$, so the orbits are equidistributed among
congruence cosets. (Moreover, (1.35) even holds with $q$ allowed to grow sufficiently
slowly with $T$.) This equidistribution is a key input, for example, in executing an
Affine Sieve in an archimedean ordering (see, e.g., [Kon09, NS10, LS10, BGS11, KO12,
MO13, BK14, HK14]). An analysis of Theorem 1.24 shows that, for thin orbits, there
are cosets which are not uniformly distributed in archimedean balls!

Proposition 1.36. Assume that $\Gamma$ is thin, and that the orbit $\mathcal{O}$ has diagonalizable
and cuspidal stabilizer, that is, $C_1 > 0$ in (1.26). Then the equidistribution (1.35) in
congruence cosets is false. For example, for each fixed $q$, there is some $\varpi \in \Gamma/\Gamma(q)$
so that

$$\mathcal{N}_{q,\varpi}(T) \gg \frac{1}{q}\, T, \tag{1.37}$$
while

$$\frac{1}{[\Gamma : \Gamma(q)]}\, \mathcal{N}_{\mathcal{O}}(T) \ll \frac{1}{q^{3}}\, T,$$

as $T \to \infty$. (The implied constants above may depend on $\Gamma$ and $x_0$, but obviously not
on $q$ or $T$.)


This means that the standard Affine Sieve procedure cannot be executed in this
ordering. (Note that the case considered in Proposition 1.36 is precisely the one
omitted in the sieving statement [MO13, Cor. 1.19].)


1.5. Outline of the Proofs and Paper. 

Our proof of Theorem 1.3 is surprisingly simple, and proceeds in two stages. The
first is to show that $\mu_T(\Psi)$ in some sense equidistributes in the “strip $\Gamma_\infty\backslash\mathbb{H}$”, but
with respect to $dx\,dy/y$, as opposed to Haar measure $dx\,dy/y^2$ (for a hint of this,
look again at Figure 1); this fact uses only the decay of the Fourier coefficients of $\Psi$
(itself a simple consequence of mixing via low-lying horocycles; see Proposition 2.16).
Then stage two is to relate this equidistribution to Eisenstein series, where we mimic
Sarnak’s approach in [Sar81, Zag81] to conclude the proof.

The rest of the paper proceeds as follows. In §2, we set notation and recall basic 
facts needed through the paper. Then in §3, we prove the main equidistribution 
result, Theorem 1.3, and its generalization (Theorem 3.2). Then Theorem 1.17 is 
proved in §4 using the Rankin-Selberg “unfolding” technique. Finally, Theorem 1.24 
and Proposition 1.36 are proved in §5. 


1.6. Notation. 

Constants $0 < C < \infty$ and $0 < \eta < 1$ can change from line to line, and $\varepsilon > 0$
represents an arbitrarily small quantity. The transpose of a matrix $g$ is written ${}^{T}\!g$.
Unless otherwise specified, implied constants depend at most on $\Gamma$, which is treated
as fixed. The symbol $\mathbf{1}_{\{\cdot\}}$ represents the indicator function of the event $\{\cdot\}$.
2. Preliminaries

In this section, we set all notation and basic facts used throughout. 

2.1. Hyperbolic Geometry. 

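Let $\mathbb{H} := \{z \in \mathbb{C} : \Im z > 0\}$ denote the hyperbolic upper half-plane. At each
point $z \in \mathbb{H}$, and tangent vector $\zeta \in T_z\mathbb{H} \cong \mathbb{C}$, the Riemannian structure is
$\|\zeta\|_z := |\zeta| / \Im z$. The unit tangent bundle $T^1\mathbb{H}$ is then

$$T^1\mathbb{H} := \{(z, \zeta) \in \mathbb{H} \times \mathbb{C} : \|\zeta\|_z = 1\}.$$

The group $G = \mathrm{PSL}_2(\mathbb{R})$ acts on $T^1\mathbb{H}$ via

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot (z, \zeta) := \left( \frac{az + b}{cz + d},\ \frac{\zeta}{(cz + d)^2} \right),$$

and moreover we can identify $G \cong T^1\mathbb{H}$ under $g \mapsto g \cdot (i, \uparrow)$. We also use the disk
model $\mathbb{D} := \{z \in \mathbb{C} : |z| < 1\}$, identified with $\mathbb{H}$ under the map

$$\mathbb{H} \ni z \mapsto (z - i)/(z + i) \in \mathbb{D}.$$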

Let $\Gamma$ be a finitely generated, Zariski dense, discrete subgroup of $G$. As above, we
identify $T^1(\Gamma\backslash\mathbb{H}) \cong \Gamma\backslash G$. For a fixed base point $o \in \mathbb{H}$, the critical exponent

$$\delta = \delta(\Gamma) \in [0, 1]$$

of $\Gamma$ is the abscissa of convergence of the Poincaré series

$$\sum_{\gamma \in \Gamma} \exp\big(-s\, d(\gamma o, o)\big), \qquad (\Re(s) > \delta).$$

Here $d(\cdot, \cdot)$ is hyperbolic distance, and $\delta$ does not depend on the choice of $o$. Let $dg$
be a choice of Haar measure on $G$; we call $\Gamma$ a lattice if $\Gamma\backslash G$ has finite measure, and
thin otherwise. This is measured by the critical exponent $\delta$, as $\delta = 1$ or $\delta < 1$ exactly
when $\Gamma$ is a lattice or thin, respectively [Pat76]; the Zariski-density of $\Gamma$ implies that
$\delta > 0$. The limit set

$$\Lambda = \Lambda(\Gamma) \subset \partial\mathbb{H} \cong S^1 \cong \mathbb{R} \cup \{\infty\}$$

of $\Gamma$ is the set of limit points of $\gamma o$, $\gamma \in \Gamma$; it also does not depend on the choice of $o$.
The Hausdorff dimension of $\Lambda$ is exactly equal to the critical exponent $\delta$ [Pat76, Sul84].
A boundary point $\xi \in \partial\mathbb{H}$ is a cusp of $\Gamma$ if it is the fixed point of a parabolic element
in $\Gamma$; these all lie in the limit set $\Lambda$, and we let $\Lambda_{\mathrm{cusp}}$ denote the subset of cusps.
A limit point $\xi \in \Lambda$ is called radial (or a “point of approximation”) if there is a
sequence $\{\gamma_n o\}$, $\gamma_n \in \Gamma$, which stays a bounded distance away from a geodesic ray
ending at $\xi$. Let $\Lambda_{\mathrm{rad}}$ denote the set of radial limit points; it is a basic fact [Bea83]
that the limit set decomposes disjointly into radial and cuspidal points,

$$\Lambda = \Lambda_{\mathrm{cusp}} \sqcup \Lambda_{\mathrm{rad}}.$$
The complement of the limit set in $\partial\mathbb{H}$ is called the free boundary of $\Gamma$,

$$\mathcal{F} = \mathcal{F}(\Gamma) := \partial\mathbb{H} \setminus \Lambda,$$

and $\mathcal{F} = \emptyset$ if and only if $\Gamma$ is a lattice. We record here the decomposition

$$\partial\mathbb{H} = \mathcal{F} \sqcup \Lambda_{\mathrm{cusp}} \sqcup \Lambda_{\mathrm{rad}}. \tag{2.1}$$

We assume henceforth that $\Gamma$ has at least one cusp, whence its critical exponent
exceeds 1/2,

$$\delta > 1/2. \tag{2.2}$$


2.2. Spectral Theory. 

The hyperbolic Laplace operator $\Delta := -y^2(\partial_{xx} + \partial_{yy})$ acts (after unique extension)
on the space $L^2(\Gamma\backslash\mathbb{H})$ of square-integrable automorphic functions, and is self-adjoint
and positive semi-definite. Let $\Omega = \Omega(\Gamma) \subset [0, \infty)$ denote the spectrum of $\Delta$. The
assumption that $\Gamma$ has at least one cusp implies the existence of continuous spectrum
above 1/4, that is, $[1/4, \infty) \subset \Omega$ (there may also be embedded discrete spectrum
in this range, which only occurs when $\Gamma$ is a lattice [Sel56, Pat75]). Below 1/4 the
spectrum is finite [LP82] and nonempty (by (2.2)); we denote these eigenvalues, often
referred to as the “exceptional spectrum,” by

$$0 \le \lambda_0 < \lambda_1 \le \cdots \le \lambda_{\max} < \frac14,$$

and introduce spectral parameters $1/2 < s_j \le 1$ defined by

$$\lambda_j = s_j(1 - s_j),$$

so that

$$1 \ge s_0 > s_1 \ge \cdots \ge s_{\max} > \frac12. \tag{2.3}$$

The bottom eigenvalue $\lambda_0$ is simple, and is related to the geometry of $\Lambda$ via the
Patterson-Sullivan formula [Pat76, Sul84]

$$\lambda_0 = \delta(1 - \delta),$$

that is, $s_0 = \delta$.
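
The passage between eigenvalues and spectral parameters can be made explicit. The short computation below is a sketch that uses only the relation $\lambda = s(1-s)$ on the range $1/2 < s \le 1$ and the Patterson-Sullivan formula quoted above.

```latex
\begin{align*}
  \lambda = s(1-s) = \tfrac14 - \big(s - \tfrac12\big)^2
    \quad&\Longleftrightarrow\quad
  s = \tfrac12 + \sqrt{\tfrac14 - \lambda}, \qquad 0 \le \lambda < \tfrac14, \\
  \frac{d\lambda}{ds} = 1 - 2s < 0 \ \ (s > \tfrac12)
    \quad&\Longrightarrow\quad
  \big( \lambda_0 < \lambda_1 \le \cdots \le \lambda_{\max}
        \iff s_0 > s_1 \ge \cdots \ge s_{\max} \big), \\
  s_0 = \delta > \tfrac12
    \quad&\Longrightarrow\quad
  \lambda_0 = \delta(1-\delta) < \tfrac14 .
\end{align*}
```

In particular the ordering of eigenvalues is reversed when passing to spectral parameters, and the running assumption $\delta > 1/2$ is exactly what places the base eigenvalue strictly below the continuous spectrum.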


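2.3. Algebra.

We will use standard notation for the subgroups $N$, $A$, and $K$ of $G$, given by:

$$N := \begin{pmatrix} 1 & \mathbb{R} \\ 0 & 1 \end{pmatrix}, \qquad A := \{\mathrm{diag}(a, 1/a) : a > 0\}, \qquad K := \mathrm{SO}(2), \tag{2.4}$$

and containing typical elements

$$n_x := \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \qquad a_y := \begin{pmatrix} \sqrt{y} & \\ & 1/\sqrt{y} \end{pmatrix}, \qquad k_\theta := \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$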

As right actions, they correspond, respectively, to the unipotent flow, geodesic flow,
and rotation of the tangent vector, keeping the base point fixed. On the other hand,
as left actions, they correspond, respectively, to horizontal translation, scaling, and
motion around a hyperbolic circle centered at $i$. Haar measure $dg$ in Iwasawa
coordinates $g = n_x a_y k_\theta$ is then given by $dg = dx\, y^{-2}\, dy\, d\theta$. The right-action by the
semigroup $A^+ := \{a_y : y > 1\}$ corresponds to the positive geodesic flow, so that a
given point $x_0 \in G \cong T^1\mathbb{H}$ gives rise to the geodesic ray $x_0 A^+$.
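
To make the Iwasawa coordinates concrete, here is a short worked computation of where $g = n_x a_y k_\theta$ sends the base point $i$. It is a sketch that assumes the Möbius action of §2.1 and the standard matrices $n_x = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$, $a_y = \mathrm{diag}(\sqrt{y}, 1/\sqrt{y})$, and $k_\theta$ the rotation by $\theta$.

```latex
\begin{align*}
  k_\theta \cdot i &= \frac{i\cos\theta + \sin\theta}{-i\sin\theta + \cos\theta} = i, &
  a_y \cdot z &= \frac{\sqrt{y}\, z}{1/\sqrt{y}} = yz, &
  n_x \cdot z &= z + x, \\
  n_x a_y k_\theta \cdot i &= n_x \cdot (iy) = x + iy .
\end{align*}
```

So $(x, y)$ are the coordinates of the base point in $\mathbb{H}$ and $\theta$ records the direction of the tangent vector; the factor $dx\, dy / y^2$ in $dg$ is the hyperbolic area element.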

2.4. Representation Theory. 

By the Duality Theorem [GGPS66], the spectral decomposition (2.3) corresponds
to the decomposition of the right regular representation of $G$ on $L^2(\Gamma\backslash G)$ as

$$L^2(\Gamma\backslash G) = V_0 \oplus V_1 \oplus \cdots \oplus V_{\max} \oplus V_{\mathrm{temp}}.$$

Here $V_{\mathrm{temp}}$ consists of the tempered spectrum (a reducible subspace); each $V_j$,
$j = 1, \ldots, \max$, is an irreducible complementary series representation of parameter $s_j$;
and $V_0$ is either the trivial representation (if $\Gamma$ is a lattice), or a complementary series
representation of parameter $s_0 = \delta$ (if $\Gamma$ is thin).

We record here a Sobolev-norm version [BR98] of the exponential decay of matrix
coefficients. Fix a basis $\mathscr{B} = \{X_1, X_2, X_3\}$ for the Lie algebra $\mathfrak{g}$ of $G$, and given a
smooth test function $\Psi \in C^\infty(\Gamma\backslash G)$, define the “$L^p$, order-$d$” Sobolev norm $\mathcal{S}_{p,d}(\Psi)$
as

$$\mathcal{S}_{p,d}(\Psi) := \sum_{\mathrm{ord}(\mathscr{D}) \le d} \|\mathscr{D}\Psi\|_{L^p(\Gamma\backslash G)}.$$

Here $\mathscr{D}$ ranges over monomials in $\mathscr{B}$ of order at most $d$.

Theorem 2.5 ([CHH88, Sha00, Ven10]). Let $(\pi, V)$ be a unitary $G$-representation,
and assume there is a number $\Theta > 1/2$ so that $V$ does not weakly contain any complementary series with parameter $s > \Theta$. Then for all smooth $v, w \in V^\infty$, we have

\[ |\langle \pi(g).v, w \rangle| \ll \|g\|^{-2(1-\Theta)}\, \mathcal{S}_{2,1}(v)\, \mathcal{S}_{2,1}(w), \tag{2.6} \]

as $\|g\|^2 := \mathrm{tr}(g^{\top} g) \to \infty$. The implied constant is absolute.
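To see what (2.6) gives along the geodesic flow, one can specialize to $g = a_y$ with $y \to 0$: then $\|a_y\|^2 = y + 1/y$, so the rate $\|a_y\|^{-2(1-\Theta)}$ behaves like $y^{1-\Theta}$. The sketch below is illustrative only; the function names are hypothetical, and the Sobolev norms in (2.6) are dropped so that only the rate is compared.

```python
def norm_sq_a(y):
    """||a_y||^2 = tr(a_y^T a_y) for a_y = diag(sqrt(y), 1/sqrt(y))."""
    return y + 1.0 / y

def decay_rate(y, theta):
    """The factor ||a_y||^{-2(1 - theta)} from (2.6), with the norms omitted."""
    return norm_sq_a(y) ** (-(1.0 - theta))

theta = 0.75  # a sample value in (1/2, 1), chosen only for illustration
for y in (1e-1, 1e-2, 1e-4):
    # The ratio tends to 1 as y -> 0, i.e. the rate is asymptotic to y^(1 - theta).
    print(y, decay_rate(y, theta) / y ** (1.0 - theta))
```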

Later we will encounter other Sobolev norms which are convex combinations of 
those above. 

2.5. Uniform Spectral Gaps. 

Recalling the spectral decomposition (2.3), we call a number $\Theta \in (1/2, \delta)$ a spectral gap for $\Gamma$ if $\Theta > s_1$. To make sense of a uniform such gap, we assume integrality.
As in §1.4, if $\Gamma$ consists of integer matrices, $\Gamma < \mathrm{PSL}_2(\mathbb{Z})$, we may, given an integer
$q \ge 1$, define the level-$q$ “congruence” subgroup

\[ \Gamma(q) := \ker\big(\Gamma \to \mathrm{PSL}_2(\mathbb{Z}/q\mathbb{Z})\big). \]

Let $\Omega(q)$ be the spectrum of $\Delta$ on $L^2(\Gamma(q)\backslash\mathbb{H})$; clearly $\Omega \subset \Omega(q)$, and in general this
inclusion is strict. We will call a number $\Theta \in (1/2, \delta)$ a uniform spectral gap for
$\Gamma$ if, for all $q \ge 1$,

\[ \Omega(q) \cap \big(\delta(1 - \delta),\ \Theta(1 - \Theta)\big] = \emptyset, \tag{2.7} \]

that is, there are no eigenvalues at any level $q$ in a neighborhood of the base eigenvalue
$\lambda_0 = \delta(1 - \delta)$. (Note that this definition is different from other related definitions in
the literature.) In a number of statements below, several quantities depend on the
spectral gap in the lattice case, but only on the critical exponent in the thin case; to
unify the two notions, we will say that such quantities depend on the first non-zero
eigenvalue of $\Gamma$.


2.6. Eisenstein Series. 

In this section we recall some basic facts from the theory of Eisenstein series. We
will assume here that the Eisenstein series is with respect to a cusp at $\infty$ and note that
Eisenstein series corresponding to other cusps are defined similarly, after conjugating
that cusp to $\infty$. In our applications, we will not have the flexibility to demand the
cusp have width 1, so we deal below with arbitrary width.

The Eisenstein series corresponding to a cusp at $\infty$ of width $\omega > 0$ is defined in
the half-plane $\mathrm{Re}(s) > 1$ by the convergent series

\[ E(z, s) := \frac{1}{\omega} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \mathrm{Im}(\gamma z)^s, \tag{2.8} \]

where $\Gamma_\infty = \begin{pmatrix} 1 & \omega\mathbb{Z} \\ 0 & 1 \end{pmatrix}$.
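For intuition, the series (2.8) can be summed numerically in the simplest lattice case. The sketch below assumes $\Gamma = \mathrm{PSL}_2(\mathbb{Z})$ (so $\omega = 1$) and uses the standard parametrization of $\Gamma_\infty\backslash\Gamma$ by coprime bottom rows $(c, d)$ modulo sign, with $\mathrm{Im}(\gamma z) = y/|cz + d|^2$. The function name `eisenstein` and the truncation parameter `bound` are hypothetical choices for illustration; the truncation is crude and only meant for $\mathrm{Re}(s)$ comfortably larger than 1.

```python
from math import gcd

def eisenstein(z, s, bound=80):
    """Truncated E(z, s) for PSL_2(Z): sum of Im(gamma z)^s over coprime (c, d) mod sign."""
    y = z.imag
    total = y ** s  # the identity coset, bottom row (c, d) = (0, 1)
    for c in range(1, bound + 1):
        for d in range(-bound, bound + 1):
            if gcd(c, abs(d)) == 1:
                total += (y / abs(c * z + d) ** 2) ** s
    return total

z = complex(0.3, 1.1)
# Automorphy: the values at z and at -1/z agree up to the truncation error.
print(eisenstein(z, 3.0), eisenstein(-1 / z, 3.0))
```

Agreement of the two printed values reflects the invariance of $E(\cdot, s)$ under $z \mapsto -1/z$, which holds exactly for the full series.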

Assume first that $\Gamma$ is a lattice. Then $E(z, s)$ has a meromorphic continuation to $\mathbb{C}$
with a functional equation sending $s \mapsto 1 - s$. In fact, it is analytic in the half plane
$\mathrm{Re}(s) > 1/2$ except for a simple pole at $s = 1$ and perhaps finitely many poles

\[ 1 > \sigma_1 > \cdots > \sigma_h > 1/2. \tag{2.9} \]

These poles comprise the “residual spectrum,” which is a subset of the “exceptional
spectrum” in (2.3); the remaining spectrum in this range, if any, is cuspidal. The
residue at $s = 1$ is

\[ \mathrm{Res}_{s=1} E(z, s) = \frac{1}{\mathrm{vol}(\Gamma\backslash\mathbb{H})}, \]

and

\[ \varphi_{\sigma_j}(z) = \mathrm{Res}_{s=\sigma_j} E(z, s) \]

are the residual forms.

For any integer $n \in \mathbb{Z}$ we also define the weight-$2n$ Eisenstein series by

\[ E_n(z, s) := \frac{1}{\omega} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \mathrm{Im}(\gamma z)^s\, \epsilon_\gamma(z)^{2n}, \tag{2.10} \]

where

\[ \epsilon_g(z) := \frac{cz + d}{|cz + d|} \qquad \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

Unless the weight $n = 0$, the $E_n$’s are all regular at $s = 1$, that is, $E_0 = E$ is the only
Eisenstein series with a pole at $s = 1$. In the range $\frac12 < \mathrm{Re}(s) < 1$, the poles of $E_n$
are the same $s = \sigma_1, \dots, \sigma_h$ as those of $E$. For each such pole $\sigma = \sigma_j$ we denote by

\[ \varphi_{\sigma,n} := \mathrm{Res}_{s=\sigma} E_n(z, s) \tag{2.11} \]

the (un-normalized) residual form of weight $2n$.

We note for future reference that the weight-$2n$ Eisenstein series and the weight-$2n$
residual forms all lie in the space $(\Gamma, 2n)$ of functions on $\mathbb{H}$ transforming by

\[ f(\gamma z) = \epsilon_\gamma(z)^{2n} f(z). \tag{2.12} \]

Still assuming that $\Gamma$ is a lattice, we also have the following bounds coming from
the spectral decomposition of $L^2(\Gamma\backslash G)$, see, e.g. [Iwa97]. For any square-integrable
$f \in (\Gamma, 2n)$, we have the bound

\[ \frac{1}{4\pi} \int_{\mathbb{R}} \big|\langle f, E_n(\cdot, \tfrac12 + ir) \rangle\big|^2\, dr \ll \|f\|_2^2. \tag{2.13} \]

When $\Gamma$ is thin, we will only use the fact that the defining series in (2.8) and (2.10)
converge absolutely in the range $\mathrm{Re}(s) > \delta$.

2.7. Decay of Fourier Coefficients. 

In this subsection we wish to record the basic fact that the (parabolic) Fourier
coefficients of an automorphic function decay in the cusp, in a uniform sense. The
method we use to establish this is completely standard, though the requisite uniformity does not seem to be in the literature; hence we give sketches of proofs for the
reader’s convenience. We again assume that $\Gamma$ has a cusp at $\infty$ of width $\omega > 0$, that
is, the isotropy group $\Gamma_\infty$ of $\infty$ is generated by the translation $z \mapsto z + \omega$.

Then a smooth, square-integrable, $\Gamma$-automorphic function $\Psi \in L^2 \cap C^\infty(T^1(\Gamma\backslash\mathbb{H}))$
has a Fourier expansion:

\[ \Psi(x + iy, \zeta) = \sum_{m \in \mathbb{Z}} a_\Psi(m; y, \zeta)\, e_\omega(mx), \tag{2.14} \]

where $e_\omega(x) := e^{2\pi i x/\omega}$, and the Fourier coefficients are given by

\[ a_\Psi(m; y, \zeta) := \frac{1}{\omega} \int_0^\omega \Psi(x + iy, \zeta)\, e_\omega(-mx)\, dx. \tag{2.15} \]
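The coefficient formula (2.15) is an ordinary Fourier integral over one period of length $\omega$, and can be approximated by an equispaced Riemann sum. The sketch below is a toy illustration: `fourier_coefficient` is a hypothetical helper, and `psi` stands in for the restriction $x \mapsto \Psi(x + iy, \zeta)$ at a fixed height and direction (here a trigonometric polynomial, for which the Riemann sum is exact up to rounding).

```python
import cmath
import math

def fourier_coefficient(psi, m, omega, samples=256):
    """Approximate (1/omega) * int_0^omega psi(x) e_omega(-m x) dx by a Riemann sum."""
    total = 0j
    for j in range(samples):
        x = omega * j / samples
        total += psi(x) * cmath.exp(-2j * math.pi * m * x / omega)
    return total / samples

omega = 2.5
psi = lambda x: 3.0 + 2.0 * math.cos(2.0 * math.pi * x / omega)

# psi = 3 + e_omega(x) + e_omega(-x), so a(0) = 3, a(1) = a(-1) = 1, all others vanish.
for m in (-1, 0, 1, 2):
    print(m, fourier_coefficient(psi, m, omega))
```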

The next proposition records the decay of such as $y \to 0$ (in a uniform statement);
the subscripts $F$ below stand for “Fourier coefficients.”

Proposition 2.16. There is a “Sobolev” norm $\mathcal{S}_F(\Psi)$ and constants $0 < C_F < \infty$
and $0 < \alpha_F < 1$, so that, uniformly over all $y > 0$ and $m \in \mathbb{Z} \setminus \{0\}$, we have

\[ |a_\Psi(m; y, \zeta)| \ll \mathcal{S}_F(\Psi)\, |m|^{C_F}\, y^{\alpha_F}. \tag{2.17} \]

The constants $\alpha_F$ and $C_F$ depend on the first non-zero eigenvalue of $\Gamma$.

The Sobolev norm and exact values of the constants $C_F$ and $\alpha_F$ are given below
in (2.23), (2.24), and (2.25), respectively; the last claim of the proposition is then
clear, namely, that the constants depend on the spectral gap when $\Gamma$ is a lattice,
and only on the critical exponent when $\Gamma$ is thin. As we are not striving for optimal
exponents (recall Remark 1.7), we have chosen to suppress their precise values so as
not to clutter the presentation. Recall also our convention that implied constants
may depend at most on $\Gamma$, unless stated otherwise.

The Proposition is an easy consequence of another standard fact, namely the 
equidistribution of (pieces of) “low-lying” horocycles; the subscripts H below stand 
for “Horocycle pieces.” 

Proposition 2.18. There is a “Sobolev” norm $\mathcal{S}_H(\Psi)$ and constants $C_H < \infty$ and
$0 < \alpha_H < 1$, so that, uniformly over all $y > 0$ and open intervals $I \subset (0, \omega)$, we have

\[ \frac{1}{|I|} \int_{x \in I} \Psi(x + iy, \zeta)\, dx = \frac{1}{\mathrm{vol}(\Gamma\backslash G)} \int_{\Gamma\backslash G} \Psi\, dg + O\big(\mathcal{S}_H(\Psi)\, |I|^{-C_H}\, y^{\alpha_H}\big). \tag{2.19} \]

This statement holds whether $\Gamma$ is a lattice or not, with the interpretation that the
first term on the right-hand side of (2.19) vanishes in the thin case. The constants
$C_H$ and $\alpha_H$ depend on the first non-zero eigenvalue of $\Gamma$.

Again, the norm and constants are detailed in (2.20), (2.21), and (2.22), which we
have suppressed in the interest of exposition. Much stronger versions of (2.19) exist
in the literature (at least in the lattice case, for which see, e.g., [Str04, SU15]), but
for the reader’s convenience, we provide a quick

Proof of Proposition 2.18 [Sketch].

We may assume that $y < 1$ (for the statement is obviously true otherwise), and we
may moreover assume that $\Psi$ is right-$K$-invariant (or replace $\Psi$ by $\tilde\Psi := \pi(k_\theta^{-1})\Psi$,
where $2\theta$ is the angle of $\zeta$ measured counterclockwise from the vertical). Then the
left hand side of (2.19) is

\[ \mathcal{J} := \frac{1}{|I|} \int_{x \in I} \Psi(n_x a_y)\, dx. \]

Let $\rho$ be a smooth, non-negative function on $\mathbb{R}$ with support in $[-1, 1]$ and $\int_{\mathbb{R}} \rho = 1$.
For $\eta > 0$ to be chosen later, concentrate $\rho$ to $\rho_\eta(x) := \eta^{-1} \rho(x/\eta)$. Write $\bar N := {}^{\top}N$,
$\bar n_x := {}^{\top}n_x$ for the opposite horocyclic group and element. Then the multiplication
map

\[ N \times A \times \bar N \to G : (n, a, \bar n) \mapsto n a \bar n \]

is bijective on an open neighborhood of the origin. Define $\xi : G \to \mathbb{R}_{\ge 0}$, supported in
such a neighborhood, via:

\[ \xi(n_x a_t \bar n_s) := c_\xi \cdot (\rho_\eta * \mathbf{1}_I)(x) \cdot \rho_\eta(\log t) \cdot \rho_\eta(s), \]

where $*$ denotes convolution, and $c_\xi \asymp 1/|I|$ is a constant (independent of $\eta$) chosen so
that $\int_G \xi = 1$. Automorphize $\xi$ to

\[ \Xi(g) := \sum_{\gamma \in \Gamma} \xi(\gamma g), \]

which is a function on $\Gamma\backslash G$ with $\int_{\Gamma\backslash G} \Xi = 1$. Finally, consider the matrix coefficient:

\[ \mathcal{H} := \langle \pi(a_y).\Psi, \Xi \rangle, \]

which we evaluate in two ways. Using the decay of matrix coefficients (2.6), we see
immediately that

\[ \mathcal{H} = \frac{1}{\mathrm{vol}(\Gamma\backslash G)} \int_{\Gamma\backslash G} \Psi\, dg + O\big(y^{1-\Theta}\, \mathcal{S}_{2,1}(\Psi)\, \mathcal{S}_{2,1}(\Xi)\big), \]

where $\Theta = s_1 + \varepsilon$ is a spectral gap in the lattice case, and $\Theta = \delta + \varepsilon$ in the thin case
(in which case the “main” term vanishes). It is easy to estimate, crudely, that

\[ \mathcal{S}_{2,1}(\Xi) \ll |I|^{-1} \eta^{-3}. \]

For a second evaluation of $\mathcal{H}$, unfold the inner product to obtain

\[ \mathcal{H} = \int_{N A \bar N} \Psi(n_x a_t \bar n_s a_y)\, \xi(n_x a_t \bar n_s)\, d\bar n_s\, da_t\, dn_x. \]

The “wavefront lemma” (in this case, trivial) states that $\bar n_s a_y = a_y \bar n_{sy}$, and we estimate

\[ \Psi(n_x a_y a_t \bar n_{sy}) = \Psi(n_x a_y) + O\big(\eta\, \mathcal{S}_{\infty,1}(\Psi)\big). \]

Hence

\[ \mathcal{H} = \int \Psi(n_x a_y)\, c_\xi\, (\rho_\eta * \mathbf{1}_I)(x)\, dx + O\big(\eta\, \mathcal{S}_{\infty,1}(\Psi)\big) = \mathcal{J} + O\big(\eta\, \mathcal{S}_{\infty,1}(\Psi)\big). \]

Combining the errors and choosing $\eta = y^{(1-\Theta)/4}\, \mathcal{S}_{2,1}(\Psi)^{1/4}\, \mathcal{S}_{\infty,1}(\Psi)^{-1/4}\, |I|^{-1/4}$ gives
(2.19) with

\[ \mathcal{S}_H(\Psi) := \mathcal{S}_{2,1}(\Psi)^{1/4}\, \mathcal{S}_{\infty,1}(\Psi)^{3/4}, \tag{2.20} \]

\[ C_H := 1/4, \tag{2.21} \]

and

\[ \alpha_H := \frac{1 - \Theta}{4}, \tag{2.22} \]

as claimed. □
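The choice of $\eta$ in the sketch is the one that equalizes its two error terms, namely $y^{1-\Theta}\, \mathcal{S}_{2,1}(\Psi)\, |I|^{-1} \eta^{-3}$ (decay of matrix coefficients) and $\eta\, \mathcal{S}_{\infty,1}(\Psi)$ (smoothing). The exponent bookkeeping can be checked mechanically. In the following sketch, which is only an illustration, a monomial $y^{a(1-\Theta)}\, \mathcal{S}_{2,1}^{\,b}\, \mathcal{S}_{\infty,1}^{\,c}\, |I|^{d}$ is stored as the tuple $(a, b, c, d)$; the helper names are hypothetical.

```python
from fractions import Fraction as Fr

def times(p, q):
    """Multiply two monomials by adding their exponent tuples."""
    return tuple(u + v for u, v in zip(p, q))

def power(p, k):
    """Raise a monomial to the k-th power."""
    return tuple(u * k for u in p)

# Exponents of (y^(1-Theta), S_{2,1}, S_{infty,1}, |I|) in the chosen eta.
eta = (Fr(1, 4), Fr(1, 4), Fr(-1, 4), Fr(-1, 4))

# y^(1-Theta) * S_{2,1} * |I|^(-1) * eta^(-3)
decay_error = times((Fr(1), Fr(1), Fr(0), Fr(-1)), power(eta, -3))
# eta * S_{infty,1}
smoothing_error = times(eta, (Fr(0), Fr(0), Fr(1), Fr(0)))

print(decay_error)
print(smoothing_error)
```

Both tuples come out as $(1/4, 1/4, 3/4, -1/4)$, which matches $\alpha_H = (1-\Theta)/4$, $\mathcal{S}_H = \mathcal{S}_{2,1}^{1/4} \mathcal{S}_{\infty,1}^{3/4}$, and $C_H = 1/4$.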


Equipped with Proposition 2.18, we may now give a quick

Proof of Proposition 2.16.

Again we may assume that $\Psi$ is right-$K$-invariant and $y < 1$. Let $J \ge 1$ be an
integer parameter to be chosen later, and write

\[ a_\Psi(m; y) = \frac{1}{\omega} \sum_{j=0}^{J-1} \int_{\omega j/J}^{\omega(j+1)/J} \Psi(x + iy)\, e_\omega(-mx)\, dx. \]

On each short interval, we estimate $e_\omega(-mx) = e(-mj/J) + O(|m|/J)$, whence

\[ a_\Psi(m; y) = \sum_{j=0}^{J-1} e(-mj/J)\, \frac{1}{\omega} \int_{\omega j/J}^{\omega(j+1)/J} \Psi(x + iy)\, dx + O\big(\|\Psi\|_\infty |m|/J\big). \]

Now on each little integral, we apply the equidistribution of pieces of “low-lying”
horocycles in the form (2.19), that is,

\[ \frac{1}{\omega} \int_{\omega j/J}^{\omega(j+1)/J} \Psi(x + iy)\, dx = \frac{1}{J}\, \frac{1}{\mathrm{vol}(\Gamma\backslash G)} \int_{\Gamma\backslash G} \Psi\, dg + O\big(\mathcal{S}_H(\Psi)\, y^{\alpha_H}\, J^{C_H - 1}\big). \]

Inserting this expression into $a_\Psi(m; y)$ and using $m \ne 0$, the roots of unity cancel
out, leaving only error terms:

\[ |a_\Psi(m; y)| \ll \mathcal{S}_H(\Psi)\, y^{\alpha_H}\, J^{C_H} + \|\Psi\|_\infty \frac{|m|}{J}. \]

Setting

\[ J \asymp \Big( y^{-\alpha_H}\, \frac{\|\Psi\|_\infty\, |m|}{\mathcal{S}_H(\Psi)} \Big)^{1/(C_H + 1)}, \]

we arrive at (2.17) with

\[ \mathcal{S}_F(\Psi) := \mathcal{S}_H(\Psi)^{1/(C_H+1)}\, \|\Psi\|_\infty^{C_H/(C_H+1)}, \tag{2.23} \]

\[ C_F := C_H/(C_H + 1), \tag{2.24} \]

and

\[ \alpha_F := \alpha_H/(C_H + 1). \tag{2.25} \]

This completes the proof. □
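As a worked instance of (2.24) and (2.25), one can plug in the values $C_H = 1/4$ and $\alpha_H = (1 - \Theta)/4$ from the sketch of Proposition 2.18. The snippet below is a small illustration with exact rational arithmetic; `fourier_constants` is a hypothetical helper, and $\alpha_H$ is represented by its rational coefficient of $(1 - \Theta)$.

```python
from fractions import Fraction

def fourier_constants(c_h, alpha_h_coeff):
    """Return (C_F, coefficient of (1 - Theta) in alpha_F) via (2.24) and (2.25)."""
    c_f = c_h / (c_h + 1)
    alpha_f_coeff = alpha_h_coeff / (c_h + 1)
    return c_f, alpha_f_coeff

c_f, alpha_f_coeff = fourier_constants(Fraction(1, 4), Fraction(1, 4))
print(c_f, alpha_f_coeff)  # prints: 1/5 1/5
```

So, under these inputs, $C_F = 1/5$ and $\alpha_F = (1 - \Theta)/5$.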


Remark 2.26. It should be noted that actually Propositions 2.18 and 2.16 are equiv¬ 
alent, in the sense that one can also use the uniform decay of Fourier coefficients to 
prove a version of (2.18) (though with possibly worse exponents). 

Remark 2.27. In the thin case, the proof of Proposition 2.16 can be made much
simpler. Namely, one can first trivially bound the $m$-th coefficient by the constant
one, $|a_\Psi(m; y, \zeta)| \le |a_\Psi(0; y, \zeta)|$, and then use (2.19) with $I = (0, \omega)$ to estimate the
constant coefficient. (Note though that if $\Gamma$ is a lattice, then of course $a_\Psi(0; y, \zeta)$ need
not decay!)






3. Equidistribution of Shears 


Recall our running assumption that $\Gamma < G = \mathrm{PSL}_2(\mathbb{R})$ is a geometrically finite,
Zariski dense, discrete group with at least one cusp, and hence critical exponent $\delta$
exceeding 1/2. As in (1.2), we will study the limit as $|T| \to \infty$ of the measures $\mu_T$ defined there.



To study the equidistribution of such, we need an appropriate space of test func¬ 
tions; in particular, we will require smoothness and at least polynomial decay at the 
cusp. Toward this end, for any cusp a of T and integer m > 1, we introduce the space 


^ a m (r\G) c L 2 nc°°(r\G) 


of smooth, square-integrable, automorphic functions with the following added prop¬ 
erty. We will state it in the case a = oo; for a general cusp a, conjugate a to oo in 
the standard way. 

We require that, for each T e £?™(T\G), there are constants 1 < Cq, < oo and 
0 < ctiji, such that 



(3.1) 


holds uniformly for all $j \le m$, $y > C_\Psi$, and all $n \in N$, $k \in K$. That is, after a certain
point high up in the specified cusp, we have completely uniform polynomial decay in
$\Psi$ and its first $m$ derivatives in $\mathfrak{k} = \mathrm{Lie}(K)$. Note that we make no demands on decay
properties (beyond square-integrability) in any other non-compact regions (cusps or
possibly flares) of $\Gamma\backslash G$ besides the specified cusp $\mathfrak{a}$. Also note that this space
is non-empty, since, e.g., it contains the subspace of smooth, compactly supported
functions, or better yet, cusp forms.

Our main theorem, from which Theorem 1.3 follows immediately, is the following. 

Theorem 3.2. Let xo • A + be a cuspidal geodesic ray ending in a cusp a ofT, and let 
T G &%(T\G) be a test function (i.e., assume (3.1) is satisfied for all j < 2). Then 
there is a finite-order “Sobolev” norm $\mathcal{S}(\Psi)$ (which depends on the constants $C_\Psi$ and
$\alpha_\Psi$ in (3.1)), and an $\eta > 0$ depending only on the first non-zero eigenvalue of $\Gamma$, so
that: if $\Gamma$ is a lattice,

\[ \mu_T(\Psi) = \log T \cdot \mu_{\Gamma\backslash G}(\Psi) + \tilde\mu_{\mathrm{Eis}}(\Psi) + O\big(\mathcal{S}(\Psi)\, T^{-\eta}\big), \]

and if $\Gamma$ is thin, then

\[ \mu_T(\Psi) = \mu_{\mathrm{Eis}}(\Psi) + O\big(\mathcal{S}(\Psi)\, T^{-\eta}\big), \]

as $|T| \to \infty$. Here $\mu_{\Gamma\backslash G}(\Psi) := \mathrm{vol}(\Gamma\backslash G)^{-1} \int_{\Gamma\backslash G} \Psi$ is the Haar probability measure,
$\tilde\mu_{\mathrm{Eis}}$ is a “regularized Eisenstein” distribution given in (3.40), and $\mu_{\mathrm{Eis}}$ is the distribution given in (3.33).

As a first simplification, we can immediately apply an auxiliary conjugation to
move $x_0$ to the origin $e = (i, \uparrow)$, whence the cusp $\mathfrak{a}$ moves to $\infty$. Unfortunately,
we have thus exhausted our free parameters, and cannot control the width of the
resulting cusp, which we denote $\omega$; that is, the isotropy group $\Gamma_\infty$ is generated by the
translation $z \mapsto z + \omega$.

As outlined in the introduction, the proof of Theorem 3.2 now proceeds in two
stages, as encapsulated in the following two theorems.


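Theorem 3.3 (Equidistribution in the “strip” $\mathfrak{S} = \Gamma_\infty\backslash G$). For a test function
$\Psi$ as in Theorem 3.2, define the measure:

\[ \mu_{T,\mathfrak{S}}(\Psi) := \frac{1}{\mathrm{vol}(\Gamma_\infty\backslash N)} \int_{n \in \Gamma_\infty\backslash N} \int_{a \in A^+} \Psi(n\, a\, a_{1/T})\, da\, dn = \frac{1}{\omega} \int_0^\omega \int_{1/T}^\infty \Psi(n_x a_y)\, \frac{dy}{y}\, dx. \tag{3.4} \]

(Recall that in this context, $da_y$ is $dy/y$, not $dy/y^2$.) Then there is a “Sobolev” norm
$\mathcal{S}_{\mathfrak{S}}(\Psi)$ and a constant $\alpha_{\mathfrak{S}} > 0$, defined in (3.26) and (3.27), respectively, so that

\[ \mu_T(\Psi) = \mu_{T,\mathfrak{S}}(\Psi) + O\big(\mathcal{S}_{\mathfrak{S}}(\Psi)\, T^{-\alpha_{\mathfrak{S}}}\big), \tag{3.5} \]

as $|T| \to \infty$. Here $\alpha_{\mathfrak{S}}$ only depends on the first non-zero eigenvalue of $\Gamma$.

Note that Theorem 3.3 makes no distinction between whether $\Gamma$ is a lattice or thin.
This dichotomy is only evident in the second stage: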

Theorem 3.6 (Eisenstein distributions). Let T G ^^(T\G) as above. 

Lattice Case: If T is a lattice in G, then there is a distribution defined in 
(3.40), and “residual” distributions y aj corresponding to (2.9) and defined in (3.42), 
so that: 

hTeW = M^log cn+zw^o 

j =1 aj 

Thin Case: If $\Gamma$ is thin in $G$, then there is a distribution $\mu_{\mathrm{Eis}}$ defined in (3.33) so
that:

\[ \mu_{T,\mathfrak{S}}(\Psi) = \mu_{\mathrm{Eis}}(\Psi) + O\big(\mathcal{S}_H(\Psi)\, T^{-\alpha_H}\big). \tag{3.7} \]

Here $\mathcal{S}_H$ and $\alpha_H$ are as in Proposition 2.18.

It is clear that Theorem 3.2 follows immediately from Theorems 3.3 and 3.6.

3.1. Stage 1: Proof of Theorem 3.3.

We proceed with a series of elementary lemmata. Beginning with the definition
(1.2), we express $\mu_T$ in terms of coordinates in $T^1(\Gamma\backslash\mathbb{H})$:

\[ \mu_T(\Psi) = \int_1^\infty \Psi\Big( \frac{yT}{\sqrt{T^2 + 1}} + i\, \frac{y}{\sqrt{T^2 + 1}},\ \zeta \Big)\, \frac{dy}{y}. \tag{3.8} \]

All of our manipulations below will not affect the direction of the tangent vector, so
we drop the $\zeta$. (Alternatively, pretend $\Psi$ is right-$K$-invariant.)

Lemma 3.9. With Cy and from (3.1), we let 

U > a v T 

be a parameter to be chosen later in (3.25). Then 

i-2 


rji \ Q'ty 

Pt (+ Ol | v f'|| 00 T 2 + C\ j, I — 


U 


(3.10) 

(3.11) 


where 


^4^1 (— 


;'*(>+4) ?• 


(3.12) 


Proof. From (3.8), make a change of variables y t —> yVT 2 + 1/T, and simplify to 


^t(^) — 


'T/Vri+i 


lTf . , • y\dy 


+ + 0( Il'PllooI” 2 


With U as in (3.10), break the range of integration [1, oo) = [1, U] U (U, oo). On the 
latter range, apply (3.1), whence (3.11) follows. □ 

Now we invoke the Fourier expansion (2.14). Define 

'^ ± (x + iy) : = a\&(m;y) e^^mx), 

mSZ\{ 0 } 

so that 

^(x + iy) = a^(0; y) + '& ± (x + iy). (3.13) 

Inserting (3.13) into (3.12) splits into a “main term” and “error”: 

yrft i = T 

where 


and 


> ==/'<* K)f- 


== F^iv + ig)*. 


T) y 


(3.14) 

(3.15) 


We first analyze the main term.

Lemma 3.16. Recalling the measure $\mu_{T,\mathfrak{S}}$ in (3.4), we have


•^2 — Rr,e{^) + 0( ( — 




(3.17) 


Proof. Inserting (2.15) into (3.14) gives 


•-^2 — 


ft/ ^ /-a; 


cu 


.y 


— / 4 / x + i—- Gte— = — 


T 


dy i r f u/T 


dy 


V 


L0 


T {x + iy) —dx. 
o Ji/t V 


Extending the y integral from U/T to oo and applying (3.1) again gives the claimed 
main and error terms in (3.17). □ 

Returning to S\ in (3.15), our next goal is to incorporate the Fourier expansion, 
via the following 


Lemma 3.18. Let 


Then 


u 


^ E^ E 




71=1 mez\{0} 


m 


\£l\ < ^2 + SoopW 


log U 


(3.19) 


Proof. We first straighten out the sheared integral in (3.15) by breaking it into sums: 


^1 ru +1 


« = tj 

u=l Ju 


lTf j. / , -v\ d v 


On each interval, estimate 


^(v + if) = vfv + if) + O (s^) 1 ) , 


and Fourier expand 




Thus 


u-i 


« = EE 


7t=l m^O 


U 


aq, l m; 


m^O 


ru+1 f n dy 
e w {my) — 

7 y. 


+ O (<Soo,i(4/) 


log 17 

T 


Inserting absolute values and estimating the bracketed term by partial integration 
gives (3.19), as claimed. □

Our final task is to estimate the sum defined in Lemma 3.18; we cannot directly use the decay of Fourier coefficients (2.17) in the full range of $m$, so introduce a parameter $M$ to be chosen later,
ficients (2.17) in the full range of m, so introduce a parameter M to be chosen later, 
and decompose 

<^2 = <£> + $<i 

where for □ G {>, <}, 


u 


* E^ E 


It— 1 


0/|m| □ M 



?)l 

\m\ 



We first estimate the large range trivially. 

Lemma 3.20. 

£> < Halloo M- 1/2 . 
Proof. Cauchy-Schwarz and Parseval give: 

1/2 

4 « E^| E." 

\m\>M 


1/2 


u =1 
U 


u 

ay l m\ - 


E 


E 

LA— 1 


1 


< > -T - 


U \ UJ 


f*UJ 

T / . u \ 


T ( x + 1 —\ 

Jo 

V Tv 


\m\>M 

2 \ V 2 

dx ) 


m 


(3.21) 


which can be estimated by (3.21), as claimed. 


□ 


Finally, we estimate the range of small m using decay of Fourier coefficients. Note 
that this is the only part of the argument involving any spectral theory; nevertheless, 
thanks to the uniformity of Proposition 2.16, we do not at this stage perceive any 
difference between the lattice and thin cases. 


Lemma 3.22. Recalling the Sobolev norm Sp and constants Cp and ap from Propo¬ 
sition 2.16, we have 

<f< < S F {%) T~ aF M Cf . (3.23) 


Proof. Applying (2.17) gives 

g _ y^ y^ \a^(m] f) 

Z-^ u 2 1 77J I 

u =1 l<|m|<M 1 




E^ E Sf-WM 0 '- 1 

u=l l<\m\<M 


U 


Otp 


T 


which is bounded as claimed in (3.23). 


□ 


Proof of Theorem 3.3. This is now a simple matter of combining the above lemmata.
To balance (3.21) and (3.23), set

\[ M = \big( \|\Psi\|_\infty\, \mathcal{S}_F(\Psi)^{-1}\, T^{\alpha_F} \big)^{1/(C_F + 1/2)}, \]

for a combined contribution from the two ranges of $m$, crudely, of

\[ O\big( \max(\mathcal{S}_F(\Psi), \mathcal{S}_{\infty,1}(\Psi)) \cdot T^{-\alpha_F/(2C_F + 1)} \big). \tag{3.24} \]

To balance the error in (3.11) and (3.17) with that of (3.19), we take $U$ to be some
power of $T$, say,

\[ U = T^{1 + 1/\alpha_\Psi}, \tag{3.25} \]

assuming that $T$ is large enough for (3.10) to be satisfied. Then the errors in (3.17) and
(3.11) are $O(C_\Psi/T)$, and the second error term in (3.19) is $O\big((1 + 1/\alpha_\Psi)\, \mathcal{S}_{\infty,1}(\Psi) \log T/T\big)$,
which subsumes $O(\|\Psi\|_\infty T^{-2})$ in (3.11). On (again, crudely) setting

\[ \mathcal{S}_{\mathfrak{S}}(\Psi) := \Big( C_\Psi + \frac{1}{\alpha_\Psi} \Big) \max(\mathcal{S}_F(\Psi), \mathcal{S}_{\infty,1}(\Psi)), \tag{3.26} \]

and

\[ \alpha_{\mathfrak{S}} := \alpha_F/(2C_F + 1), \tag{3.27} \]

as in (3.24), one can verify directly that the net error is as claimed in (3.5).

This completes the proof. □
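The balancing of the large-$m$ bound $\|\Psi\|_\infty M^{-1/2}$ against the small-$m$ bound $\mathcal{S}_F(\Psi)\, T^{-\alpha_F} M^{C_F}$ can be checked at the level of exponents of $T$. The sketch below is illustrative only: `balancing_exponents` is a hypothetical helper, the norms are dropped, and the sample inputs $C_F = 1/5$, $\alpha_F = (1 - \Theta)/5$ are what (2.24) and (2.25) give for $C_H = 1/4$, $\alpha_H = (1 - \Theta)/4$ (with $\alpha_F$ again represented by its coefficient of $1 - \Theta$).

```python
from fractions import Fraction

def balancing_exponents(c_f, alpha_f):
    """Exponents of T after balancing the two ranges of m.

    Returns (exponent of T in M, exponent of T in the resulting error).
    """
    m_exp = alpha_f / (c_f + Fraction(1, 2))   # M = T^{m_exp}, norms dropped
    large_range = -m_exp / 2                   # from M^{-1/2}
    small_range = -alpha_f + c_f * m_exp       # from T^{-alpha_F} M^{C_F}
    assert large_range == small_range          # this choice of M equalizes them
    return m_exp, large_range

m_exp, err_exp = balancing_exponents(Fraction(1, 5), Fraction(1, 5))
print(m_exp, err_exp)  # prints: 2/7 -1/7
```

With these sample inputs the error exponent is $-\alpha_F/(2C_F + 1)$, that is, $\alpha_{\mathfrak{S}} = (1 - \Theta)/7$ in (3.27).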


3.2. Stage 2: Proof of Theorem 3.6. 

We first give the proof in the thin case, as it is significantly easier. 


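3.2.1. Assume $\Gamma$ is thin in $G$.

Returning to (3.4), write $\mu_{T,\mathfrak{S}}(\Psi)$ as:

\[ \mu_{T,\mathfrak{S}}(\Psi) = \frac{1}{\omega} \Big( \int_0^\infty - \int_0^{1/T} \Big) \int_0^\omega \Psi(z, \uparrow)\, y\, dz = \mathcal{R}_1 - \mathcal{R}_2, \]

say. Here we have set $dz := dx\, dy/y^2$. We bound $\mathcal{R}_2$ by

\[ |\mathcal{R}_2| \le \int_0^{1/T} |a_\Psi(0; y, \uparrow)|\, \frac{dy}{y} \ll \int_0^{1/T} \mathcal{S}_H(\Psi)\, y^{\alpha_H}\, \frac{dy}{y} \ll \mathcal{S}_H(\Psi)\, T^{-\alpha_H}, \tag{3.28} \]

where we applied (2.19) (with $I = (0, \omega)$).

Recalling that $\Gamma_\infty = \begin{pmatrix} 1 & \omega\mathbb{Z} \\ 0 & 1 \end{pmatrix} < \Gamma$, we next deal with

\[ \mathcal{R}_1 := \frac{1}{\omega} \int_{\Gamma_\infty\backslash\mathbb{H}} \Psi(z, \uparrow)\, y\, dz. \tag{3.29} \]

Note that the integral converges absolutely; for $y \to \infty$, this is due to (3.1), while for
$y \to 0$, we can again use (2.19).

For ease of exposition, it is convenient at this point to first assume that $\Psi$ is right-$K$-invariant, that is,

\[ \Psi(z, \zeta) = \Psi(z). \tag{3.30} \]

Below we detail the modifications needed to handle the general case.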

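Recall from (2.8) that

\[ E(z, s) := \frac{1}{\omega} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \mathrm{Im}(\gamma z)^s \]

is the Eisenstein series at $\infty$ of a cusp of width $\omega$. Note that the defining sum
converges absolutely and uniformly on compacta in the range $\mathrm{Re}(s) > \delta$, since $\Gamma$ is
assumed to be a thin subgroup of $G$. In particular, $E(z, s)$ is regular at $s = 1$.

Then, letting $\mathcal{F}$ be a fixed fundamental domain for $\Gamma\backslash\mathbb{H}$, we can “re-fold” and
write (3.29) as

\[ \mathcal{R}_1 = \frac{1}{\omega} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \int_{\gamma\mathcal{F}} \Psi(z)\, y\, dz = \frac{1}{\omega} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \int_{\mathcal{F}} \Psi(z)\, \mathrm{Im}(\gamma z)\, dz = \langle \Psi, E(\cdot, 1) \rangle. \]

Setting

\[ \mu_{\mathrm{Eis}}(\Psi) := \langle \Psi, E(\cdot, 1) \rangle, \tag{3.31} \]

we immediately see that $\mathcal{R}_1 = \mu_{\mathrm{Eis}}(\Psi)$, which combined with (3.28) gives:

\[ \mu_{T,\mathfrak{S}}(\Psi) = \mu_{\mathrm{Eis}}(\Psi) + O\big(\mathcal{S}_H(\Psi)\, T^{-\alpha_H}\big), \]

as claimed.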

Finally, we remove the assumption (3.30) and extend the proof to the general case.
For a unit tangent vector $\zeta$ at $z$, write $\theta \in [-\pi, \pi)$ for the “angle” of $\zeta = \zeta_\theta$, measured
from the vertical $\uparrow$ counterclockwise. We first decompose $\Psi(z, \zeta)$ in a Fourier series
in $\zeta$, writing:

\[ \Psi(z, \zeta) = \sum_{n} \Phi_n(z)\, \chi_n(\zeta), \tag{3.32} \]

where $\chi_n(\zeta_\theta) = e^{in\theta}$ in the above notation, and

\[ \Phi_n(z) := \frac{1}{2\pi} \int_{-\pi}^{\pi} \Psi(z, \zeta_\theta)\, \overline{\chi_n(\zeta_\theta)}\, d\theta. \]

Note that each $\Phi_n$ lives in the space $(\Gamma, 2n)$ of functions on $\mathbb{H}$ given in (2.12).

Returning to $\mathcal{R}_1$ in (3.29), we insert (3.32) (with $\zeta = \uparrow$), and “re-fold” again,
obtaining:

\[ \mathcal{R}_1 = \frac{1}{\omega} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \int_{\gamma\mathcal{F}} \sum_{n \in \mathbb{Z}} \Phi_n(z)\, \mathrm{Im}(z)\, dz = \frac{1}{\omega} \sum_{n} \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} \int_{\mathcal{F}} \epsilon_\gamma(z)^{2n}\, \Phi_n(z)\, \mathrm{Im}(\gamma z)\, dz = \sum_{n} \langle \Phi_n, E_n(\cdot, 1) \rangle. \]

Here, $E_n(z, s)$ are the “weight-$2n$” Eisenstein series given by the series (2.10); these
all converge absolutely for $\mathrm{Re}(s) > \delta$. The absolute convergence of the sum over $n$ is
guaranteed by (3.1) after taking two derivatives in $\theta$ and noting that $\Psi$ satisfies (3.1) with $m = 2$.
Then, on defining

\[ \mu_{\mathrm{Eis}}(\Psi) := \sum_{n \in \mathbb{Z}} \langle \Phi_n, E_n(\cdot, 1) \rangle, \tag{3.33} \]

(3.7) follows immediately. Note that if $\Psi$ is $K$-invariant, the two definitions (3.31)
and (3.33) agree, and moreover $\mu_{\mathrm{Eis}}$ is actually a measure. In general, $\mu_{\mathrm{Eis}}$ is a
distribution, as we need some derivatives of $\Phi_n$ to ensure the convergence of (3.33).
This completes the proof in the thin case.

3.2.2. Case $\Gamma$ is a lattice in $G$.

In this case, our analysis proceeds in a similar fashion to that in [Sar81]. We begin
with the following

Lemma 3.34. For $1 < \sigma < 1 + \alpha_\Psi$, we have



(3.35) 


n 


Proof. Starting with (3.4), write 



(3.36) 


where we have set 


\[ h_T(y) := y \cdot \mathbf{1}_{\{y > 1/T\}}. \]

Note the Mellin transform/inverse pair: 



and 



The first integral converges absolutely for $\mathrm{Re}(s) = \sigma > 1$; the second is henceforth
interpreted (after partial integration) as the absolutely convergent integral



(3.37) 


Inserting (3.37) into (3.36) with the above convention gives 



(3.38) 


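which is absolutely convergent in the range $1 < \sigma < 1 + \alpha_\Psi$ using (3.1).
Now we proceed as in the thin case, decomposing $\Psi$ as in (3.32) and “unfolding”; for each $n \in \mathbb{Z}$ this gives

\[ \frac{1}{\omega} \int_{\Gamma_\infty\backslash\mathbb{H}} \Phi_n(z)\, y^s\, dz = \langle \Phi_n, E_n(\cdot, s) \rangle. \]

Summing over $n$ and inserting into (3.38) gives (3.35), as claimed. □

To finish the proof of Theorem 3.6, we make the following definition:

\[ \tilde E_n(z, s) := \begin{cases} E_n(z, s) & n \ne 0, \\ E(z, s) - \dfrac{1}{\mathrm{vol}(\Gamma\backslash\mathbb{H})}\, \dfrac{1}{s - 1} & n = 0, \end{cases} \]

which, again, is regular at $s = 1$ for all $n$. Then (3.35) can be rewritten as

\[ \mu_{T,\mathfrak{S}}(\Psi) = \mu_{\Gamma\backslash G}(\Psi) \log(T) + \sum_{n \in \mathbb{Z}} \frac{1}{2\pi i} \int_{(\sigma)} \frac{T^{s-1}}{s - 1}\, \langle \Phi_n, \tilde E_n(\cdot, s) \rangle\, ds, \tag{3.39} \]

where we used that

\[ \frac{\langle \Phi_0, 1 \rangle}{\mathrm{vol}(\Gamma\backslash\mathbb{H})} = \mu_{\Gamma\backslash G}(\Psi) \qquad \text{and} \qquad \frac{1}{2\pi i} \int_{(\sigma)} \frac{T^{s-1}}{(s - 1)^2}\, ds = \log T. \]

Now shifting the contour of integration to $\mathrm{Re}(s) = \frac12$, we pick up residues from the
simple pole at $s = 1$ and from the residual spectrum at $s = \sigma_j$ as in (2.9). The residue
at $s = 1$ is

\[ \sum_{n \in \mathbb{Z}} \langle \Phi_n, \tilde E_n(\cdot, 1) \rangle =: \tilde\mu_{\mathrm{Eis}}(\Psi), \tag{3.40} \]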

nEZ 

that is, this is our “second-order” contribution, and is a distribution (as opposed to
a measure) since $\Psi$ is not assumed to be $K$-finite. Note that if $\Psi$ is $K$-fixed, then
(3.40) simplifies to just

\[ \tilde\mu_{\mathrm{Eis}}(\Psi) = \langle \Psi, \tilde E(\cdot, 1) \rangle, \tag{3.41} \]

as claimed in (1.11).

Each pole $s = \sigma_j$ contributes a residue $\frac{T^{\sigma_j - 1}}{\sigma_j - 1}\, \mu_{\sigma_j}(\Psi)$, where

\[ \mu_{\sigma_j}(\Psi) := \sum_{n \in \mathbb{Z}} \langle \Phi_n, \varphi_{\sigma_j, n} \rangle, \tag{3.42} \]

with $\varphi_{\sigma_j, n}$ the “weight-$2n$” residual form given in (2.11). Note that these distributions
are exactly the same as those arising in Sarnak’s analysis [Sar81, p. 737].

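We thus obtain

\[ \mu_{T,\mathfrak{S}}(\Psi) = \mu_{\Gamma\backslash G}(\Psi) \log(T) + \tilde\mu_{\mathrm{Eis}}(\Psi) + \sum_{j=1}^{h} \frac{T^{\sigma_j - 1}}{\sigma_j - 1}\, \mu_{\sigma_j}(\Psi) + \sum_{n \in \mathbb{Z}} \frac{1}{2\pi i} \int_{(1/2)} \frac{T^{s-1}}{s - 1}\, \langle \Phi_n, \tilde E_n(\cdot, s) \rangle\, ds. \]

Finally, taking absolute values and combining Cauchy-Schwarz with (2.13), we can
bound each of the terms in the last sum by

\[ \Big| \frac{1}{2\pi i} \int_{(1/2)} \frac{T^{s-1}}{s - 1}\, \langle \Phi_n, E_n(\cdot, s) \rangle\, ds \Big| \ll T^{-1/2}\, \|\Phi_n\|_2. \]

On estimating $\|\Phi_n\|_2$ and summing over $n$, we finally conclude the proof of Theorem 3.6.

4. Application 1: Moments of L-functions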


Theorem 1.17 now follows readily from Theorem 1.3, as we explain below. Recall that we will illustrate the method on the simplest case of $f$ being a weight-$k$
holomorphic Hecke cusp form on $\mathrm{PSL}_2(\mathbb{Z})$; the calculation for general cuspidal GL(2)
automorphic representations is similar.

Let $\Psi(x + iy) = |f(x + iy)|^2 y^k$, and use (1.16) to write the left hand side of (1.18)
as

\[ \frac{1}{2\pi} \int_{\mathbb{R}} |L(f, \tfrac12 + it)|^2\, |W_k(\tfrac12 + it, T)|^2\, dt = \int_0^\infty \Psi(Ty + iy)\, \frac{dy}{y}, \]

where we have set

\[ \mathcal{T} := \sqrt{T^2 + 1} \]

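for convenience. Theorem 1.3 can be applied directly to the range $[1/\mathcal{T}, \infty)$, but
the range $(0, 1/\mathcal{T})$ must be manipulated. Changing variables $y \mapsto 1/y$, using the
automorphy of $\Psi$ that $\Psi(-1/z) = \Psi(z)$, and changing $y \mapsto \mathcal{T}^2 y$ gives:

\[ \int_0^{1/\mathcal{T}} \Psi(Ty + iy)\, \frac{dy}{y} = \int_{\mathcal{T}}^\infty \Psi\Big( \frac{-yT}{\mathcal{T}^2} + \frac{iy}{\mathcal{T}^2} \Big)\, \frac{dy}{y} = \int_{1/\mathcal{T}}^\infty \Psi(-yT + iy)\, \frac{dy}{y}. \]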

Now we can apply Theorem 1.3 to both contributions, giving

\[ \frac{1}{2\pi} \int_{\mathbb{R}} |L(f, \tfrac12 + it)|^2\, |W_k(\tfrac12 + it, T)|^2\, dt = 2\mu_{\Gamma\backslash G}(\Psi) \log(T) + 2\tilde\mu_{\mathrm{Eis}}(\Psi) + O_\Psi(T^{-\eta}). \tag{4.1} \]

The first term is of course

\[ \mu_{\Gamma\backslash G}(\Psi) = \frac{\|f\|^2}{\mathrm{vol}(\Gamma\backslash\mathbb{H})}, \]

where the norm is with respect to the Petersson inner product. It remains to show
that the second term, that is, the Eisenstein measure $\tilde\mu_{\mathrm{Eis}}(\Psi)$, can be expressed as a
special value (at the edge of the critical strip) of a symmetric square L-function. Note
that $\Psi$ is a function on $\mathbb{H}$, that is, as a function on $G$ it is right-$K$-invariant; therefore
$\tilde\mu_{\mathrm{Eis}}(\Psi)$ is determined by the simpler expression (3.41) (or (1.11)).


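Proposition 4.2. With the above notation, we have

\[ \tilde\mu_{\mathrm{Eis}}(\Psi) = \frac{\|f\|^2}{\mathrm{vol}(\Gamma\backslash\mathbb{H})} \Big( \frac{\Lambda'}{\Lambda}(\mathrm{sym}^2 f, 1) + \gamma - 2\frac{\zeta'}{\zeta}(2) \Big). \]

Clearly Proposition 4.2 inserted into (4.1) gives the right hand side of (1.18), completing the proof of Theorem 1.17.

Proof of Proposition 4.2. To evaluate

\[ \mathcal{Z} := \tilde\mu_{\mathrm{Eis}}(\Psi) = \langle |f|^2 y^k, \tilde E(\cdot, 1) \rangle, \]

use (1.9) to write

\[ \mathcal{Z} = \lim_{s \to 1} \langle |f|^2 y^k, \tilde E(\cdot, s) \rangle = \lim_{s \to 1} \Big( \langle |f|^2 y^k, E(z, s) \rangle - \frac{\|f\|^2}{V}\, \frac{1}{s - 1} \Big), \tag{4.3} \]

where

\[ V := \mathrm{vol}(\Gamma\backslash\mathbb{H}). \]

We analyze $\langle |f|^2 y^k, E(z, s) \rangle$ by standard Rankin-Selberg theory; more generally for
two cusp forms $f$ and $g$ of weight $k$, we have

\[ \langle f \bar g\, y^k, E(\cdot, s) \rangle = \int_{\Gamma_\infty\backslash\mathbb{H}} f(z)\, \overline{g(z)}\, y^k\, y^s\, \frac{dx\, dy}{y^2} \]

\[ = \int_0^\infty \int_0^1 \sum_{n} a_f(n)\, e^{2\pi i n x} e^{-2\pi n y} \sum_{m} \overline{a_g(m)}\, e^{-2\pi i m x} e^{-2\pi m y}\, y^{s+k}\, \frac{dx\, dy}{y^2} \]

\[ = \sum_{n \ge 1} \frac{a_f(n)\, \overline{a_g(n)}}{n^{s+k-1}} \int_0^\infty e^{-4\pi y}\, y^{s+k-1}\, \frac{dy}{y} \]

\[ = (4\pi)^{-(s+k-1)}\, \Gamma(s + k - 1)\, L(f \otimes g, s) = \Lambda(f \otimes g, s). \]

When $f = g$, the Rankin-Selberg L-function factors (see, e.g., [Iwa97, p. 232]) as

\[ L(f \otimes f, s) = \frac{\zeta(s)}{\zeta(2s)}\, L(\mathrm{sym}^2 f, s). \]

Hence

\[ \langle |f|^2 y^k, E(\cdot, s) \rangle = \Lambda(f \otimes f, s) = \frac{\zeta(s)}{\zeta(2s)}\, \Lambda(\mathrm{sym}^2 f, s), \tag{4.4} \]

where $\Lambda(\mathrm{sym}^2 f, s)$ is as in (1.19). Taking residues at $s = 1$ on both sides of (4.4)
gives

\[ \frac{\|f\|^2}{V} = \frac{1}{\zeta(2)}\, \Lambda(\mathrm{sym}^2 f, 1). \tag{4.5} \]

Inserting (4.5) and (4.4) into (4.3) gives

\[ \mathcal{Z} = \lim_{s \to 1} \Big( \frac{\zeta(s)}{\zeta(2s)}\, \Lambda(\mathrm{sym}^2 f, s) - \frac{1}{s - 1}\, \frac{1}{\zeta(2)}\, \Lambda(\mathrm{sym}^2 f, 1) \Big). \]

Using $\zeta(s) - \frac{1}{s-1} \to \gamma$ as $s \to 1$ (Euler’s constant), and elementary calculus, we have
that

\[ \mathcal{Z} = \gamma\, \frac{\Lambda(\mathrm{sym}^2 f, 1)}{\zeta(2)} + \frac{\Lambda'(\mathrm{sym}^2 f, 1)}{\zeta(2)} - 2\, \frac{\Lambda(\mathrm{sym}^2 f, 1)}{\zeta(2)}\, \frac{\zeta'(2)}{\zeta(2)} = \frac{\|f\|^2}{V} \Big( \frac{\Lambda'}{\Lambda}(\mathrm{sym}^2 f, 1) + \gamma - 2\frac{\zeta'}{\zeta}(2) \Big), \]

on using (4.5) again. This completes the proof. □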


Note that one could also extend our method to Eisenstein series, and then evaluate 
the (weighted) fourth moment of the Riemann zeta function using Theorem 1.3. 


4.1. Subconvexity? 

We leave open the problem of extracting from the effective second moment (1.18)
a subconvex bound

\[ L(f, \tfrac12 + it) \ll_f |t|^{1/2 - \eta} \]

in the $t$-aspect. Such is already known [Goo86, PS01] in the Maass case via trace
formulae, explicit expansions, and shifted convolutions, but it would be interesting to
give a new proof using only equidistribution. (Of course the general GL(2) subconvexity problem has been resolved [MV10]; the interest here would be in the method
used.)

The key issue is that the archimedean factor $|W_k|^2$ in (1.18) is a smooth weight,
which does not allow truncation; if the weight could be replaced by a sharp cutoff
while still having a power savings rate, then the subconvexity bound would follow
immediately. This could be accomplished by finding a function $\Psi_X(T)$ so that

\[ \int_{\mathbb{R}} \Psi_X(T)\, |W_k(\tfrac12 + it, T)|^2\, dT = \mathbf{1}_{|t| < X}; \tag{4.6} \]

indeed, then one would multiply both sides of (1.18) by $\Psi_X(T)$ and integrate in $T$,
obtaining

\[ \int_{|t| < X} |L(f, \tfrac12 + it)|^2\, dt = \int_{\mathbb{R}} \Psi_X(T) \big( C_1 \log T + C_2 + O(T^{-\eta}) \big)\, dT = C_1' X \log X + C_2' X + O(X^{1 - \eta'}). \]

Another approach is to “shorten the interval,” that is, to replace the right hand side of
(4.6) by $\mathbf{1}_{|t - X| < Y}$, with $Y < X^{1 - \eta}$.

Either way, one would need to invert the “$W$-transform”:

\[ \Psi(T) \mapsto \hat\Psi(t) := \int_{\mathbb{R}} \Psi(T)\, |W_k(\tfrac12 + it, T)|^2\, dT. \]

Unfortunately, there are basic difficulties with said inversion, namely a Paley-Wiener
(or Heisenberg uncertainty) analysis shows that the transform has insufficient harmonics to be invertible and functions $\Psi_X$ as above do not exist, even in this simple
holomorphic case! (Cf. the related discussion in, e.g., [GRS13, Appendix].) The case
of non-holomorphic Hecke-Maass forms is seemingly even more complicated as the
weights (1.15) will involve Bessel functions.

A potential method to circumvent this issue (since our equidistribution theorem
is proved in the generality of the unit tangent bundle) is to use all the harmonics
afforded us by $f$, that is, by applying Maass raising and lowering operators. This
does not change the L-function, but results in effective second moments with a large
span of weight functions $W$. One can hope that enough combinations of these can
recover the desired sharp cutoff functions $\Psi_X$, and we plan to return to this question
later.

5. Application 2: Counting and Non-Equidistribution
5.1. Proof of Theorem 1.24. 

As the method of counting from equidistribution is by now completely standard, 
we give a brief sketch (only the setup is not completely obvious). Let $G = \mathrm{SL}_2(\mathbb{R})$ 
be the spin double cover $\iota : G \to \mathrm{SO}_Q(\mathbb{R})$ of the (identity component of the) special 
orthogonal group preserving an indefinite ternary quadratic form $Q$. Let $\Gamma < G$ be 
discrete, Zariski-dense, geometrically finite, and have at least one cusp, and given 
$x_0 \in \mathbb{R}^3$, let $\mathcal{O} = x_0\,\iota(\Gamma)$ be a discrete orbit. Let $H = \mathrm{Stab}_G\,x_0$ be the stabilizer of 
$x_0$ in $G$, and let $\Gamma_H := \Gamma \cap H$ be the stabilizer in $\Gamma$. Given an archimedean norm 
$\|\cdot\|$ on $\mathbb{R}^3$, we obtain a norm-$T$ ball $B_T$ in $H\backslash G$ as in (1.28). Our goal is to estimate 
$\mathcal{N}_{\mathcal{O}}(T) = |\mathcal{O} \cap B_T|$, that is, 

$$\mathcal{N}_{\mathcal{O}}(T) = \#\{\gamma \in \Gamma_H\backslash\Gamma : \|x_0\,\iota(\gamma)\| < T\}.$$

Thanks to the discussion in §1.3.1 (see Table 1), there are only two new cases to 
prove, both occurring only when $H = \mathrm{Stab}_G\,x_0$ is diagonalizable. We can choose the 
spin cover $\iota$ up to conjugation, and hence can assume that $H = A$. Having made such 
a choice, we will henceforth drop $\iota$ from the notation. To handle the two lacunary 
cases, we may assume that $\Gamma_H$ is trivial. (In principle, $\Gamma_H$ could be finite.) 

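We then break $\mathcal{N}_{\mathcal{O}}(T)$ into two contributions as follows. Recalling the shear $s_t$ in 
(1.1), we decompose each $g \in G = ANK$ uniquely as $g = a s_t k$, and write 

$$G^{\pm} := \{g = a s_t k \in G : a \in A^{\pm}\}.$$

Hence we can write 

$$\mathcal{N}_{\mathcal{O}}(T) = \mathcal{N}_{\mathcal{O}}^{+}(T) + \mathcal{N}_{\mathcal{O}}^{-}(T),$$

say, where 

$$\mathcal{N}_{\mathcal{O}}^{\pm}(T) := \#\{\gamma \in \Gamma \cap G^{\pm} : \|x_0 \gamma\| < T\},$$

and treat only $\mathcal{N}_{\mathcal{O}}^{+}(T)$, the other contribution being the same (after conjugation). 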

If $\Gamma$ is a lattice, then the “lacunary” case occurs only when both $0$ and $\infty$ (that is, 
the two endpoints of $A$) are cusps of $\Gamma$. When $\Gamma$ is thin, the “lacunary” cases occur 
when at least one of $0$, $\infty$ is a cusp; Lemma 1.30 forces the other endpoint to be either 
a cusp or in the free boundary. If $\infty$, say, is in the free boundary, then $\mathcal{N}_{\mathcal{O}}^{+}(T)$ gives 
a contribution of order $T^{\delta}$, as described below (1.32). So to restrict attention to the 
lacunary case, we assume that $\infty$ is a cusp of $\Gamma$. Now we continue with the standard 
smoothing/unsmoothing argument applied to the equidistribution theorem. For ease 
of exposition, assume that the norm $\|\cdot\|$ is right-$K$-invariant. (This assumption is 
standard to relax.) 

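Let $\psi : G \to \mathbb{R}_{\ge 0}$ be a right-$K$-invariant bump function supported in an $\epsilon > 0$ ball 
about the origin in $G/K$ with $\int_G \psi = 1$. Set $\Psi(g) := \sum_{\gamma \in \Gamma} \psi(\gamma g)$, so that $\int_{\Gamma\backslash G} \Psi = 1$. 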

Let 

$$f_T^{+}(g) := \mathbf{1}_{\{\|x_0 g\| < T,\ g \in G^{+}\}},$$

and $F_T^{+}(g) := \sum_{\gamma \in \Gamma} f_T^{+}(\gamma g)$. Then 

$$F_T^{+}(e) = \mathcal{N}_{\mathcal{O}}^{+}(T)$$

and 

$$\langle F_T^{+}, \Psi \rangle = \mathcal{N}_{\mathcal{O}}^{+}(T)\,(1 + O(\epsilon)),$$

since $\Psi$ is a bump function about the origin. Unfolding the inner product gives 

$$\langle F_T^{+}, \Psi \rangle = \int_G f_T^{+}(g)\,\Psi(g)\,dg = \int_{s_t} f_T^{+}(s_t) \left[ \int_{a \in A^{+}} \Psi(a s_t)\,da \right] ds_t.$$

Applying Theorem 1.3 to the bracketed term and integrating in $t$ completes the sketch 
of the two remaining cases of Theorem 1.24. 

Figure 4. The orbit $x_0 \varpi \Gamma_{\infty,q}$ for $\varpi = e$ inside $B_T \subset \mathbb{H}$. 


5.2. Proof of Proposition 1.36. 

As above, let the stabilizer $H = A$ be diagonalizable, let $\infty$ be a cusp of $\Gamma$, and 
assume for ease of exposition that the norm $\|\cdot\|$ is right-$K$-invariant. The statement 
of Proposition 1.36 assumes that $\Gamma < \mathrm{SL}_2(\mathbb{Z})$ is integral and thin. For an integer 
$q \ge 1$, let $\Gamma(q) < \Gamma$ be its level-$q$ principal congruence subgroup, and for a fixed 
$\varpi \in \Gamma/\Gamma(q)$, let 

$$\mathcal{O}_{q,\varpi} := x_0\,\varpi\,\Gamma(q)$$

be the congruence coset orbit. The corresponding counting function is then 

$$\mathcal{N}_{q,\varpi}(T) := |\mathcal{O}_{q,\varpi} \cap B_T|.$$

We claim that this count depends on $\varpi$, that is, is not distributed uniformly among 
the cosets. 

One way to see this is to unravel the formalism of the previous proof, and note 
that $C_1$ in (1.26) is essentially the evaluation at $s = 1$ and some $z = z_\varpi \in \mathbb{H}$ of the 
(unregularized) Eisenstein series $E(z,s)$ for $\Gamma(q)$; there is no reason for these values 
to coincide for different $\varpi$. An even easier way to see the non-equidistribution is to 
look at one picture. 

Assume for simplicity that $\begin{pmatrix} 1 & \mathbb{Z} \\ & 1 \end{pmatrix} < \Gamma$; then the isotropy group of $\infty$ in $\Gamma(q)$ is 

$$\Gamma_{\infty,q} := \begin{pmatrix} 1 & q\mathbb{Z} \\ & 1 \end{pmatrix} = \Gamma(q) \cap \Gamma_\infty.$$

Certainly the orbit $\mathcal{O}_{q,\varpi}$ contains the points 

$$x_0\,\varpi\,\Gamma_{\infty,q} \subset \mathcal{O}_{q,\varpi},$$

so 

$$\mathcal{N}_{q,\varpi}(T) \ge |x_0\,\varpi\,\Gamma_{\infty,q} \cap B_T|.$$

Converting Figure 3c from the disk $\mathbb{D}$ to the hyperbolic plane $\mathbb{H}$, we show in Figure 
4 how the shaded region $B_T$ contains the orbit points $\varpi\Gamma_{\infty,q}$ for $\varpi = e$. A moment’s 
reflection (or rather, translation) shows that (1.37) holds for this orbit. 

Something similar would happen if one were to take congruence cosets with the 
subgroups $\Gamma_0(q) := \{\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma : c \equiv 0\ (q)\}$, say, instead of $\Gamma(q)$. The isotropy group 
$\Gamma_\infty$ would now remain unchanged, but the same picture shows that, for the identity 
coset (with $\varpi = e$), the number of points in an orbit is of order $T$, whereas the average 
count should be of order $T/q$. We leave it as an interesting challenge to develop sieve 
methods which apply to this non-uniformly distributed (in the archimedean ordering) 
setting. 
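
The cusp contribution behind both of these lower bounds can be checked numerically in a toy model. The following is only a minimal sketch under assumptions that are not taken from the text: we assume that $G$ acts on trace-zero $2 \times 2$ matrices by conjugation, $x \cdot g = g^{-1} x g$, that $x_0 = \mathrm{diag}(1,-1)$ (whose stabilizer is the diagonal group), and that the norm is the Frobenius norm, which is invariant under conjugation by rotations. The function names are hypothetical. Under these assumptions the unipotent element with upper-right entry $j$ sends $x_0$ to the matrix with rows $(1, 2j)$ and $(0, -1)$, and the code counts such points in the ball of radius $T$, with $j$ restricted to multiples of a given step.

```python
# Toy model (all choices here are illustrative assumptions, not the paper's setup):
# x . g = g^{-1} x g on trace-zero 2x2 integer matrices, x0 = diag(1, -1),
# and the Frobenius norm as the archimedean norm.

X0 = ((1, 0), (0, -1))


def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def act(x, g):
    """Return g^{-1} x g for a 2x2 matrix g of determinant 1."""
    (a, b), (c, d) = g
    g_inv = ((d, -b), (-c, a))
    return matmul(matmul(g_inv, x), g)


def frobenius_sq(x):
    """Squared Frobenius norm (sum of squared entries)."""
    return sum(entry * entry for row in x for entry in row)


def count_unipotent_points(T, step):
    """Count j in step*Z with ||x0 . n_j|| < T, where n_j = [[1, j], [0, 1]].

    step = 1 models the full unipotent group (1, Z; 0, 1);
    step = q models its subgroup (1, qZ; 0, 1).
    """
    count = 0
    # ||x0 . n_j||^2 = 2 + 4 j^2, so |j| <= T is a safe search window.
    for j in range(-T, T + 1):
        if j % step == 0 and frobenius_sq(act(X0, ((1, j), (0, 1)))) < T * T:
            count += 1
    return count
```

In this model, `count_unipotent_points(1000, 1)` returns 999 while `count_unipotent_points(1000, 10)` returns 99: the count from the cusp is linear in $T$, and shrinks by a factor of about $q$ only when the isotropy group itself shrinks to translations by multiples of $q$.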